
Full text of "Natural Philosophy Of Cause And Chance"























Oxford University Press, Amen House, London E.C. 4


Geoffrey Cumberlege, Publisher to the University 


A DRAFT of these lectures, written before they were delivered,
contained considerably more technicalities and mathematics than 
the present text. Facing a large audience in which physicists 
and mathematicians were presumably a minority, I had to 
change my plans and to improvise a simplified presentation. 
Though this did not seem difficult on the platform of the Hall 
of Magdalen College, Oxford, the final formulation for publi- 
cation was not an easy task. I did not like replacing rigorous 
mathematical reasoning by that mixture of literary style, 
authority, and mystery which is often used by popularizing and 
philosophizing scientists. Thus, the idea occurred to me to 
preserve the mathematics by removing it to a detailed appendix 
which could also contain references to the literature. The vast 
extension of the latter, however, compelled me to restrict quota- 
tions to recent publications which are not in the text-books. 
Some of these supplements contain unpublished investigations 
of my school, mainly by my collaborator Dr. H. S. Green. In the 
text itself I have given up the original division into seven lectures
and replaced it by a more natural arrangement into ten chapters.

I have to thank Dr. Green for his untiring help in reading, 
criticizing, and correcting my script, working out drafts of the 
appendix, and reading proofs. I am also indebted to Mr. Lewis 
Elton not only for proof-reading but for carefully preparing the 
index. I have further to thank Albert Einstein for permission 
to publish sections of two of his letters.

My most sincere gratitude is due to the President and the 
Fellows of Magdalen College who gave me the opportunity to 
plan these lectures, and the leisure to write them down for
publication.

I wish to thank the Oxford University Press for the excellent 
printing and their willingness to follow all my wishes. 

M. B. 














1. Multiple causes (II, 8)
2. Derivation of Newton's law from Kepler's laws (III, 13)
3. Cauchy's mechanics of continuous media (IV, 20)
4. Maxwell's equations of the electromagnetic field (IV, 24)
5. Relativity (IV, 27)
6. On classical and modern thermodynamics (V, 38)
7. Theorem of accessibility (V, 39)
8. Thermodynamics of chemical equilibrium (V, 43)
9. Velocity of sound in gases (V, 44)
10. Thermodynamics of irreversible processes (V, 45)
11. Elementary kinetic theory of gases (VI, 47)
12. Statistical equilibrium (VI, 50)
13. Maxwell's functional equation (VI, 51)
14. The method of the most probable distribution (VI, 52)
15. The method of mean values (VI, 54)
16. Boltzmann's collision integral (VI, 56)
17. Irreversibility in gases (VI, 57)
18. Formalism of statistical mechanics (VI, 60)
19. Quasi-periodicity (VI, 62)
20. Fluctuations and Brownian motion (VI, 63)
21. Reduction of the multiple distribution function (VI, 67)
22. Construction of the multiple distribution function (VI, 68)
23. Derivation of the collision integral from the general theory of fluids (VI, 69)
24. Irreversibility in fluids (VII, 72)
25. Atomic physics (VIII, 75)
26. The law of equipartition (VIII, 77)
27. Operator calculus in quantum mechanics (VIII, 91)
28. General formulation of the uncertainty principle (IX, 94)
29. Dirac's derivation of the Poisson brackets in quantum mechanics (IX, 97)
30. Perturbation theory for the density matrix (IX, 100)
31. The functional equation of quantum statistics (IX, 112)
32. Degeneration of gases (IX, 113)
33. Quantum equations of motion (IX, 116)
34. Supraconductivity (IX, 118)
35. Economy of thinking (X, 124)
36. Concluding remarks (X, 127)

INDEX


THE practice of representing vector quantities by means of clarendon 
type in print is now well established, and used throughout these lectures. 
For dealing with cartesian tensors, the notation of Chapman and Milne, 
explained in the first chapter of Chapman and Cowling's book The 
Mathematical Theory of Non-Uniform Gases (C.U.P., 1939) is used; this
consists in printing tensors in sans serif type. 

The following examples will suffice to show how vector and tensor 
equations are translated into coordinate notation. 

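    $\mathbf{a} = \mathbf{b}$  →  $a_k = b_k$  $(k = 1, 2, 3)$

    $\mathbf{a}\cdot\mathbf{b} = \sum_k a_k b_k$

    $\mathsf{a} = \mathsf{b}$  →  $a_{kl} = b_{kl}$  $(k, l = 1, 2, 3)$

    $\mathsf{a}\cdot\mathbf{b} = \mathbf{c}$  →  $\sum_l a_{kl} b_l = c_k$  $(k = 1, 2, 3)$

    $\mathbf{a}\cdot\mathsf{b}\cdot\mathbf{c} = \sum_{k,l} a_k b_{kl} c_l$

    $\mathbf{a}\wedge\mathbf{b} = \mathbf{c}$  →  $a_2 b_3 - a_3 b_2 = c_1$, etc.

    $\operatorname{grad} a = \mathbf{b}$  →  $\dfrac{\partial a}{\partial x_k} = b_k$  $(k = 1, 2, 3)$

    $\operatorname{div}\mathbf{a} = \sum_k \dfrac{\partial a_k}{\partial x_k}$

    $\dfrac{\partial}{\partial\mathbf{x}}\cdot\mathsf{a} = \operatorname{div}\mathsf{a} = \mathbf{b}$  →  $\sum_l \dfrac{\partial a_{lk}}{\partial x_l} = (\operatorname{div}\mathsf{a})_k = b_k$  $(k = 1, 2, 3)$

    $\operatorname{curl}\mathbf{a} = \mathbf{b}$  →  $\dfrac{\partial a_3}{\partial x_2} - \dfrac{\partial a_2}{\partial x_3} = b_1$, etc.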
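The translation into coordinate notation can be made concrete with a short sketch. It assumes plain Python lists for vectors and nested lists for cartesian tensors; the helper names (`dot`, `tensor_dot_vector`, `cross`) and the sample values are hypothetical and chosen only for illustration.

```python
# Illustrative sketch only: helper names and sample values are assumptions,
# not part of the original notation page.

def dot(a, b):
    """Scalar product a.b = sum over k of a_k b_k."""
    return sum(a[k] * b[k] for k in range(3))

def tensor_dot_vector(t, b):
    """Components c_k = sum over l of t_kl b_l of the product t.b."""
    return [sum(t[k][l] * b[l] for l in range(3)) for k in range(3)]

def cross(a, b):
    """Vector product: c_1 = a_2 b_3 - a_3 b_2, and cyclic permutations."""
    return [a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0]]

a = [1, 2, 3]
b = [4, 5, 6]
t = [[1, 0, 0], [0, 2, 0], [0, 0, 3]]  # a diagonal tensor

print(dot(a, b))                        # scalar product
print(tensor_dot_vector(t, b))          # tensor acting on a vector
print(dot(a, tensor_dot_vector(t, b)))  # double contraction a.t.b
print(cross(a, b))                      # vector product
```

Each function is a literal transcription of one index formula, which is the point of the notation: the symbolic equation and the component sum say the same thing.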


THE notions of cause and chance which I propose to deal with 
in these lectures are not specifically physical concepts but have 
a much wider meaning and application. They are used, more or 
less vaguely, in everyday life. They appear, not only in all 
branches of science, but also in history, psychology, philosophy, 
and theology ; everywhere with a different shade of meaning. 
It would be far beyond my abilities to give an account of all 
these usages, or to attempt an analysis of the exact significance 
of the words 'cause' and 'chance' in each of them. However,
it is obvious that there must be a common feature in the use of 
these notions, like the theme in a set of variations. Indeed, 
cause expresses the idea of necessity in the relation of events, 
while chance means just the opposite, complete randomness. 
Nature, as well as human affairs, seems to be subject to both 
necessity and accident. Yet even accident is not completely 
arbitrary, for there are laws of chance, formulated in the mathe- 
matical theory of probability, nor can the cause-effect relation 
be used for predicting the future with certainty, as this would 
require a complete knowledge of the relevant circumstances, 
present, past, or both together, which is not available. There 
seems to be a hopeless tangle of ideas. In fact, if you look 
through the literature on this problem you will find no satisfac- 
tory solution, no general agreement. Only in physics has a 
systematic attempt been made to use the notions of cause and 
chance in a way free from contradictions. Physicists form their 
notions through the interpretation of experiments. This method 
may rightly be called Natural Philosophy, a word still used for 
physics at the Scottish universities. In this sense I shall attempt 
to investigate the concepts of cause and chance in these lectures. 
My material will be taken mainly from physics, but I shall try 
to regard it with the attitude of the philosopher, and I hope that 
the results obtained will be of use wherever the concepts of 
cause and chance are applied. I know that such an attempt will 
not find favour with some philosophers, who maintain that 



science teaches only a narrow aspect of the world, and one which 
is of no great importance to man's mind. It is true that many
scientists are not philosophically minded and have hitherto 
shown much skill and ingenuity but little wisdom. I need hardly
enlarge on this subject. The practical applications of science
have given us the means of a fuller and richer life, but also the 
means of destruction and devastation on a vast scale. Wise 
men would have considered the consequences of their activities 
before starting on them ; scientists have failed to do so, and only 
recently have they become conscious of their responsibilities to 
society. They have gained prestige as men of action, but they 
have lost credit as philosophers. Yet history shows that science 
has played a leading part in the development of human thought. 
It has not only supplied raw material to philosophy by gathering 
facts, but also evolved the fundamental concepts on how to deal 
with them. It suffices to mention the Copernican system of the 
universe, and the Newtonian dynamics which sprang from it. 
These originated the conceptions of space, time, matter, force, 
and motion for a long time to come, and had a mighty influence 
on many philosophical systems. It has been said that the meta- 
physics of any period is the offspring of the physics of the pre- 
ceding period. If this is true, it puts us physicists under the 
obligation to explain our ideas in a not-too-technical language. 
This is the purpose of the following lectures on a restricted 
though important field. I have made an attempt to avoid 
mathematics entirely, but failed. It would have meant an un- 
bearable clumsiness of expression and loss of clarity. A way out 
would have been the reduction of all higher mathematics to 
elementary methods in Euclidean style following the cele- 
brated example of Newton's Principia. But this would even 
have increased the clumsiness and destroyed what there is of 
aesthetic appeal. I personally think that more than 200 years 
after Newton there should be some progress in the assimilation of 
mathematics by those who are interested in natural philosophy. 
So I shall use ordinary language and formulae in a suitable 
mixture; but I shall not give proofs of theorems (they are 
collected in the Appendix). 


In this way I hope to explain how physics may throw some 
light on a problem which is not only important for abstract 
knowledge but also for the behaviour of man. An unrestricted 
belief in causality leads necessarily to the idea that the world is 
an automaton of which we ourselves are only little cog-wheels. 
This means materialistic determinism. It resembles very much 
that religious determinism accepted by different creeds, where 
the actions of men are believed to be determined from the 
beginning by a ruling of God. I cannot enlarge on the difficulties 
to which this idea leads if considered from the standpoint of 
ethical responsibility. The notion of divine predestination 
clashes with the notion of free will, in the same way as the 
assumption of an endless chain of natural causes. On the other 
hand, an unrestricted belief in chance is impossible, as it cannot 
be denied that there are a great many regularities in the world ; 
hence there can be, at most, 'regulated accident'. One has to 
postulate laws of chance which assume the appearance of laws 
of nature or laws for human behaviour. Such a philosophy 
would give ample space for free will, or even for the willed 
actions of gods and demons. In fact, all primitive polytheistic 
religions seem to be based on such a conception of nature : things 
happening in a haphazard way, except where some spirit interferes
with a purpose. We reject to-day this demonological
philosophy, but admit chance into the realm of exact science. 
Our philosophy is dualistic in this respect ; nature is ruled by 
laws of cause and laws of chance in a certain mixture. How is 
this possible? Are there no logical contradictions? Can this 
mixture of ideas be cast into a consistent system in which all 
phenomena can be adequately described or explained ? What 
do we mean by such an explanation if the feature of chance is 
involved ? What are the irreducible or metaphysical principles 
involved? Is there any room in this system for free will or for
the interference of deity ? These and many other questions can 
be asked. I shall try to answer some of them from the stand- 
point of the physicist, others from my philosophical convictions 
which are not much more than common sense improved by 
sporadic reading. The statement, frequently made, that modern


physics has given up causality is entirely unfounded. Modern
physics, it is true, has given up or modified many traditional 
ideas ; but it would cease to be a science if it had given up the 
search for the causes of phenomena. I found it necessary, 
therefore, to formulate the different aspects of the fundamental 
notions by giving definitions of terms which seem to me in agree- 
ment with ordinary language. With the help of these concepts, 
I shall survey the development of physical thought, dwelling 
here and there on special points of interest, and I shall try to 
apply the results to philosophy in general. 



THE concept of causality is closely linked with that of determin- 
ism, yet they seem to me not identical. Moreover, causality is 
used with several different shades of meaning. I shall try to dis- 
entangle these notions and eventually sum them up in definitions. 
The cause-effect relation is used mainly in two ways ; I shall 
illustrate this by giving examples, partly from ordinary life, 
partly from science. Take these statements : 

'Overpopulation is the cause of India's poverty.'

'The stability of British politics is caused by the institution of monarchy.'

'Wars are caused by the economic conditions.'

'There is no life on the moon because of the lack of an atmosphere containing oxygen.'

'Chemical reactions are caused by the affinity of molecules.'

The common feature to which I wish to draw your attention 
is the fact that these sentences state timeless relations. They 
say that one thing or one situation A causes another B, meaning
apparently that the existence of B depends on A, or that if A 
were changed or absent, B would also be changed or absent. 
Compare these statements with the following : 

'The Indian famine of 1946 was caused by a bad harvest.'

'The fall of Hitler was caused by the defeat of his armies.'

'The American war of secession was caused by the economic situation of the slave states.'

'Life could develop on earth because of the formation of an atmosphere containing oxygen.'

'The destruction of Hiroshima was caused by the explosion of an atomic bomb.'

In these sentences one definite event A is regarded as the 
cause of another B ; both events are more or less fixed in space 
and time. I think that these two different shades of the cause- 
effect relation are both perfectly legitimate. The common factor 


is the idea of dependence, which needs some comment. This
concept of dependence is clear enough if the two things con- 
nected are concepts themselves, things of the mind, like two 
numbers or two sets of numbers ; then dependence means what 
the mathematician expresses by the word 'function'. This 
logical dependence needs no further analysis (I even think it 
cannot be further analysed). But causality does not refer to 
logical dependence ; it means dependence of real things of nature 
on one another. The problem of what this means is not simple 
at all. Astrologers claim the dependence of the fate of human 
beings on the constellations of stars. Scientists reject such statements;
but why? Because science accepts only relations of
dependence if they can be verified by observation and experi- 
ment, and we are convinced that astrology has not stood this 
test. Science insists on a criterion for dependence, namely 
repetitive observation or experiment : either the things A and B 
refer to phenomena, occurring repeatedly in Nature and being 
sufficiently similar for the aspect in question to be considered 
as identical; or repetition can be artificially produced by
experiment.

Observation and experiment are crafts which are systemati- 
cally taught. Sometimes, by a genius, they are raised to the 
level of an art. There are rules to be observed: isolation of the 
system considered, restriction of the variable factors, varying 
of the conditions until the dependence of the effect on a single 
factor becomes evident ; in many cases exact measurements and 
comparison of figures are essential. The technique of handling 
these figures is a craft in itself, in which the notions of chance 
and probability play a decisive part; we shall return to this
question at a later stage. So it looks as if science has a methodical 
way of finding causal relations without referring to any meta- 
physical principle. But this is a deception. For no observation 
or experiment, however extended, can give more than a finite 
number of repetitions, and the statement of a law, 'B depends on
A', always transcends experience. Yet this kind of statement
is made everywhere and all the time, and sometimes from
scanty material. Philosophers call it Inference by Induction, 


and have developed many a profound theory of it. I shall not 
enter into a discussion of these speculations. But I have to 
make it clear why I distinguish this principle of induction from 
causality. Induction allows one to generalize a number of observa- 
tions into a general rule : that night follows day and day follows 
night, or that in spring the trees grow green leaves, are induc- 
tions, but they contain no causal relation, no statement of 
dependence. The method of inductive thinking is more general 
than causal thinking ; it is used in everyday life as a matter of 
course, and it applies in science to the descriptive and experi- 
mental branches as well. But while everyday life has no definite 
criterion for the validity of an induction and relies more or less 
on intuition, science has worked out a code, or rule of craft, for 
its application. This code has been entirely successful, and I 
think that is the only justification for it, just as the rules of the
craft of classical music are only justified by full audiences and 
applause. Science and art are not so different as they appear. 
The laws in the realms of truth and beauty are laid down by the 
masters, who create eternal works. 

Absolute values are ideals never reached. Yet I think that 
the common effort of mankind has approached some ideals in 
quite a respectable way. I do not hesitate to call a man foolish 
if he rejects the teaching of experience because no logical proof 
is forthcoming, or because he does not know or does not accept 
the rules of the scientific craft. You find such super-logical 
people sporadically among pure mathematicians, theologians, 
and philosophers, while there are besides vast communities of 
people ignorant of or rejecting the rules of science, among them
the members of anti-vaccination societies and believers in 
astrology. It is useless to argue with them ; I cannot compel 
them to accept the same criteria of valid induction in which I 
believe: the code of scientific rules. For there is no logical 
argument for doing so ; it is a question of faith. In this sense I 
am willing to call induction a metaphysical principle, namely 
something beyond physics. 

After this excursion, let us return to causality and its two 
ways of application, one as a timeless relation of dependence, 


the other as a dependence of one event fixed in time and space 
on another (see Appendix, 1). I think that the abstract, timeless 
meaning of causality is the fundamental one. This becomes 
quite evident if one tries to use the term in connexion with a 
specific case without implicit reference to the abstraction. For 
example: The statement that a bad harvest was the cause of the 
Indian famine makes sense only if one has in mind the timeless 
statement that bad harvests are causes of famines in general. 
I leave it to you to confirm this with the other examples I have 
given or with any more you may invent. If you drop this refer- 
ence to a general rule, the connexion between two consecutive 
events loses its character of causality, though it may still retain 
the feature of perfect regularity, as in the sequence of day and 
night. Another example is the time-table of a railway line. You 
can predict with its help the arrival at King's Cross of the 
10 o'clock from Waverley ; but you can hardly say that the time- 
table reveals a cause for this event. In other words, the law 
of the time-table is deterministic : You can predict future events 
from it, but the question 'why ? ' makes no sense. 

Therefore, I think one should not identify causality and 
determinism. The latter refers to rules which allow one to 
predict from the knowledge of an event A the occurrence of an 
event B (and vice versa), but without the idea that there is a 
physical timeless (and spaceless) link between all things of the 
kind A and all things of the kind B. I prefer to use the expression
'causality' mainly for this timeless dependence. It is
exactly what experimentalists and observers mean when they 
trace a certain phenomenon to a certain cause by systematic 
variation of conditions. The other application of the word to 
two events following one another is, however, in so common use 
that it cannot be excluded. Therefore I suggest that it should be 
used also, but supplemented by some 'attributes' concerning 
time and space. It is always assumed that the cause precedes the 
effect ; I propose to call this the principle of antecedence. Further, 
it is generally regarded as repugnant to assume a thing to cause an 
effect at a place where it is not present, or to which it cannot be 
linked by other things; I shall call this the principle of contiguity.


I shall now try to condense these considerations in a few
definitions.

Determinism postulates that events at different times are 
connected by laws in such a way that predictions of un- 
known situations (past or future) can be made. 

By this formulation religious predestination is excluded, since 
it assumes that the book of destiny is only open to God. 

Causality postulates that there are laws by which the occur- 
rence of an entity B of a certain class depends on the 
occurrence of an entity A of another class, where the word 
' entity ' means any physical object, phenomenon, situation, 
or event. A is called the cause, B the effect. 

If causality refers to single events, the following attributes 
of causality must be considered : 

Antecedence postulates that the cause must be prior to, or at
least simultaneous with, the effect.

Contiguity postulates that cause and effect must be in spatial
contact or connected by a chain of intermediate things in
contact.



I SHALL now illustrate these definitions by surveying the 
development of physical science. But do not expect an ordinary 
historical treatment. I shall not describe how a great man 
actually made his discoveries, nor do I much care what he him- 
self said about it. I shall try to analyse the scientific situation 
at the time of the discovery, judged by a modern mind, and 
describe them in terms of the definitions given. 

Let us begin with the oldest science, astronomy. Pre- 
Newtonian theory of celestial motions is an excellent example 
of a mathematical and deterministic, yet not causal, description. 
This holds for the Ptolemaic system as well as for the Copernican, 
including Kepler's refinements. Ptolemy represented the motion 
of the planets by kinematic models, cycles, and epicycles rolling 
on one another and on the fixed heavenly sphere. Copernicus 
changed the standpoint and made the sun the centre of cyclic 
planetary motion, while Kepler replaced the cycles by ellipses. 
I do not wish to minimize the greatness of Copernicus' step in 
regard to the conception of the Universe. I just consider it from 
the standpoint of the question which we are discussing. Neither 
Ptolemy nor Copernicus nor Kepler states a cause for the be- 
haviour of the planets, except the ultimate cause, the will of the 
Creator. What they do is, in modern mathematical language, 
the establishment of functions, 

    $x_1 = f_1(t), \quad x_2 = f_2(t), \quad \ldots$

for the coordinates of all particles, depending on time. Coperni- 
cus himself claimed rightly that his functions, or more accurately 
the corresponding geometrical structures, are very much simpler 
than those of Ptolemy, but he refrained from advocating the 
cosmological consequences of his system. This question came 
to the foreground long after his death mainly by Galileo's tele- 
scopic observations, which revealed in Jupiter and his satellites 
a repetition of the Copernican system on a smaller scale. 


Descartes's cosmology can be regarded as an early attempt to 
establish causal laws for the planetary orbits by assuming a 
complicated vortex motion of some kind of ether, and it is 
remarkable that this construction satisfies contiguity. But it 
failed because it lacked the main feature of scientific progress ; 
it was not based on a reasonable induction from facts. Of course, 
no code of rules existed, nor did Descartes's writings provide it 
at that time. The principles of the code accepted to-day are 
implicitly contained in the works of Galileo and Newton, who 
demonstrated them with their actual discoveries in physics and 
astronomy in the same way as Haydn established the rules of 
the sonata by writing lovely music in this form. 

Galileo's work precedes Newton's not only in time but also 
in logical order ; for Galileo was experimenting with terrestrial 
objects according to the rules of repetition and variation of con- 
ditions, while Newton's astronomical material was purely obser- 
vational and restricted. Galileo observed how a falling body 
moves, and studied the conditions on which the motion depends. 
His results can be condensed into the well-known formula for the 
vertical coordinate of a small body or 'particle' as a function of 

time,

    $z = -\tfrac{1}{2} g t^{2}$,    (3.1)


where g is a constant, i.e. independent not only of time but also 
of the falling body. The only thing this quantity g can depend 
upon is the body towards which the motion is taking place, the 
earth, a conclusion which is almost too obvious to be formulated;
for if the motion is checked by my hand, I feel the weight
as a pressure directed downwards towards the earth. Hence the 
constant g must be interpreted as a property of the earth, not of
the falling body. 

Using Newton's calculus (denoting the time derivative by a 
dot) and generalizing for all three coordinates, one obtains the 
equations

    $\ddot{x} = 0, \quad \ddot{y} = 0, \quad \ddot{z} = -g$,    (3.2)


which describe the trajectories of particles upon the earth with 
arbitrary initial positions and velocities. 

These formulae condense the description of an infinite number 
of orbits and motions in one single simple statement: that some 


property of the motion is the same for the whole class, indepen- 
dent of the individual case, therefore depending only on the one 
other thing involved, namely the earth. Hence this property, 
namely the vertical acceleration, must be 'due to the earth', or 
'caused by the earth', or 'a force exerted by the earth'.

This word 'force' indicates a specification of the general 
notion of cause, namely a measurable cause, expressible in 
figures. Apart from this refinement, Galileo's work is just a case 
of ordinary causality in the sense of my definition. 

Yet the law (3.2) involves time, since the 'effect' of the force 
is an acceleration, the rate of change of velocity in time. This 
is the actual result of observation and measurement, and has no 
metaphysical root whatsoever. A consequence of this fact is the 
deterministic character of the law (3.2) : if the position and the 
velocity of a particle are given at any time, the equations deter- 
mine its position and velocity at any other time. 

In fact, any other time in the past or future. This shows that 
Galileo's law does not conform to the postulate of antecedence : 
a given initial situation cannot be regarded as the cause of a 
later situation, because the relation between them is completely 
symmetrical; each determines the other. This is closely con- 
nected with the notion of time which Galileo used, and which 
Newton took care to define explicitly. 
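The symmetry between earlier and later states can be checked numerically. The following is a minimal sketch, assuming a value of 9.81 m/s² for g and a hypothetical helper `state_at`; it uses the exact solution of the vertical equation in (3.2), and the initial height and velocity are made-up illustration values.

```python
G = 9.81  # assumed value of Galileo's constant g, in m/s^2

def state_at(t, t0, z0, v0, g=G):
    """Height and vertical velocity at time t, given the state (z0, v0) at t0.

    Uses the exact solution of z'' = -g; t may lie before or after t0.
    """
    dt = t - t0
    z = z0 + v0 * dt - 0.5 * g * dt ** 2
    v = v0 - g * dt
    return z, v

# Take a state at t = 0 and predict the state two seconds later.
z_later, v_later = state_at(2.0, 0.0, 100.0, 5.0)

# Now treat the later state as the given one and recover the earlier state.
z_back, v_back = state_at(0.0, 2.0, z_later, v_later)

print(z_later, v_later)
print(z_back, v_back)  # agrees with the starting values up to rounding
```

The same function serves for prediction and retrodiction, so neither state has a better claim than the other to be called the cause.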

The postulate of contiguity is also violated by Galileo's law 
since the action of the earth on the moving particle needs 
apparently no contact. But this question is better discussed in 
connexion with Newton's generalization. 

Newton applied Galileo's method to the explanation of celestial 
motions. The material on which he based his deductions was 
scanty indeed; for at that time only six planets (including the 
earth) and a few satellites of these were known. I say 'deduc- 
tions', for the essential induction had already been made by 
Kepler when he announced his three laws of planetary motion as 
valid for planets in general. The first two laws, concerning the 
elliptic shape of the orbit and the increase of the area swept by 
the radius vector, were based mainly on Tycho Brahe's observa- 
tions of Mars, i.e. of one single planet. Generalized by a sweeping 


induction to any planet they are, according to Newton, equiva- 
lent to the statement that the acceleration is always directed 
towards the sun and varies inversely as the square of the distance 
r from the sun, $\mu r^{-2}$, where $\mu$ is a constant which may differ
from planet to planet. But it is the third law which reveals the 
causal relation to the sun. It says that the ratio of the square 
of the period and the cube of the principal axis is the same for 
all planets, a result induced from data about the six known planets.
This implies, as Newton showed (see Appendix, 2), that the
constant $\mu$ is the same for all planets. Hence, as in Galileo's case,
it can depend only on the single other body involved, the sun.
In this way the interpretation is obtained that the centripetal
acceleration $\mu r^{-2}$ is 'due to the sun', or 'caused by the sun', or
'a force exerted by the sun'. 
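Kepler's third law is easy to test against numbers. The sketch below assumes approximate modern values for the semi-major axes (in astronomical units) and sidereal periods (in years) of the six planets known in Newton's time; these figures are not from the text and are rounded. For a circular orbit the constant of attraction is $4\pi^{2} a^{3}/T^{2}$, so a common value of $a^{3}/T^{2}$ means a common constant.

```python
# Approximate modern values, assumed for illustration (not taken from the text):
# semi-major axis a in astronomical units, sidereal period T in years.
PLANETS = {
    "Mercury": (0.387, 0.2408),
    "Venus": (0.723, 0.6152),
    "Earth": (1.000, 1.0000),
    "Mars": (1.524, 1.8808),
    "Jupiter": (5.203, 11.862),
    "Saturn": (9.537, 29.457),
}

def kepler_ratio(a, period):
    """Cube of the axis divided by the square of the period."""
    return a ** 3 / period ** 2

ratios = {name: kepler_ratio(a, T) for name, (a, T) in PLANETS.items()}

for name, ratio in ratios.items():
    print(f"{name:8s} a^3/T^2 = {ratio:.4f}")
```

In these units every ratio comes out close to 1, which is the numerical content of the statement that the constant depends only on the sun.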

The moon and the other planetary satellites were then the 
material for the induction which led to the generalization of a 
mutual attraction of all bodies towards one another. The most 
amazing step, rightly admired by Newton's contemporaries and 
later generations, was the inclusion of terrestrial bodies in the 
law derived from the heavens. This is in fact the idea symbolized 
by the apocryphal story of the falling apple : terrestrial gravity 
was regarded by Newton as identical with celestial attraction.
By applying his laws of motion to the system earth-moon, he 
could calculate Galileo's constant of gravity g from geodetical 
and astronomical data, namely,

    $g = \dfrac{4\pi^{2} R^{3}}{r^{2} T^{2}}$,    (3.3)

where r is the radius of the earth, R the distance between the
centres of earth and moon, and T the time of revolution of
the moon (sidereal month).
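The calculation can be repeated with modern figures. This is a rough sketch under stated assumptions: a circular lunar orbit, so that the moon's acceleration $4\pi^{2}R/T^{2}$ equals $\mu/R^{2}$, and the same inverse-square law at the earth's surface, giving $g = \mu/r^{2} = 4\pi^{2}R^{3}/(r^{2}T^{2})$. The numerical values and the function name `surface_gravity` are assumptions for illustration.

```python
import math

# Approximate modern values, assumed for illustration:
R_MOON = 3.844e8          # mean earth-moon distance in metres
R_EARTH = 6.371e6         # mean radius of the earth in metres
T_MONTH = 27.32 * 86400   # sidereal month in seconds

def surface_gravity(r, R, T):
    """Estimate g from the moon's orbit, assuming a circular orbit."""
    mu = 4.0 * math.pi ** 2 * R ** 3 / T ** 2  # constant of attraction of the earth
    return mu / r ** 2                         # inverse-square law at the surface

g = surface_gravity(R_EARTH, R_MOON, T_MONTH)
print(f"g = {g:.2f} m/s^2")
```

The estimate lands within about one per cent of the measured value near 9.8 m/s², which is as close as this simplified model can be expected to come.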

The general equations for the motion of n particles under 
mutual gravitation read in modern vector notation 



where $\mathbf{r}_\alpha$ is the position vector of the particle $\alpha$ ($\alpha = 1, 2, \ldots, n$),
$r_{\alpha\beta} = |\mathbf{r}_\alpha - \mathbf{r}_\beta|$ the distance of two particles $\alpha$ and $\beta$, V the
potential of the gravitational forces.

Newton also succeeded in generalizing the laws of motion for 
other non-gravitational forces by introducing the notion of mass, 
or more precisely of inertial mass. Newton's method of repre- 
senting his results in an axiomatic form does not reveal the way 
he obtained them. It is, however, possible to regard this step as 
a case of ordinary causality derived by induction. One has to 
observe the acceleration of different particles produced by the 
same non-gravitational (say elastic) forces at the same point of 
space ; they are found to differ, but not in direction, only in 
magnitude. Therefore, one can infer by induction the existence 
of a scalar factor characteristic for the resistance of a particle 
against acceleration or its inertia. This factor is called 'mass'.
It may still depend on velocity as is assumed in modern theory 
of relativity. This can be checked by experiment, and as in 
Newton's time no such effect could be observed, the mass was 
regarded as a constant. 

Then the generalized equations of motions read 

    $m_\alpha \ddot{\mathbf{r}}_\alpha = \mathbf{F}_\alpha$,    (3.6)

where $m_\alpha$ ($\alpha = 1, 2, \ldots, n$) are the masses and $\mathbf{F}_\alpha$ the force vectors
which depend on the mutual distances $r_{\alpha\beta}$ of all the particles.
As in the case of gravitation, they may be derivable from a 
potential V by the operation 

F a =-grad a F, (3.7) 

where V is a function of the $r_{\alpha\beta}$. The most general form of V for
forces inverse to the square of the distance would be ($\sum'$ means
summing over all $\alpha$, $\beta$ except $\alpha = \beta$)

where $\mu_{\alpha\beta}$ are constants; comparison with (3.4) and (3.5) shows,
however, that these must have the form

$$\mu_{\alpha\beta} = m_\alpha \mu_\beta. \tag{3.9}$$


Newton applies further a law of symmetry, stated axiomatically,
namely that action equals reaction, or $\mu_{\alpha\beta} = \mu_{\beta\alpha}$, from
which follows

$$\mu_\beta = \kappa m_\beta, \tag{3.10}$$

where $\kappa$ is a universal constant, the constant of gravitation.
Hence the constants of attraction or gravitational masses $\mu_\alpha$
are proportional to the inertial masses $m_\alpha$.

Neither Newton himself, nor many generations of physicists 
and astronomers after him have paid much attention to the law 
expressed by (3.10). Astronomical observations left little doubt 
that it was correct, and it was proved by terrestrial observations 
(with suitable pendulums) to hold with extreme accuracy 
(Eötvös and others). Two centuries went by before Einstein saw
the fundamental problem contained in the simple equation 
(3.10), and built on it the colossal structure of his theory of 
general relativity, to which we have to return later. 

But this is not our concern here. We have to examine 
Newton's equations from the standpoint of the principle of 
causality. I hope I have made it clear that they imply the notion 
of cause exactly in the same sense as it is always used by the 
experimentalist, namely signifying a verifiable dependence of
one thing on another. Yet this one thing is, in Galileo's and
Newton's theory alike, a peculiar quantity, namely an accelera- 
tion. The peculiarity is not only that it cannot be seen or read 
from a measuring tape, but that it contains the time implicitly. 
In fact, Newton's equations determine the motion of a system 
in time completely for any given initial state (position and 
velocity of all particles involved). In this way, 'causation' leads
to 'determination', not as a new metaphysical principle, but as a
physical fact, like any other. However, just as in Galileo's 
simpler case, so here the relation between two consecutive con- 
figurations of the system is mutual and symmetrical. This has a 
bearing on the question whether the principle of antecedence 
holds. As this applies, according to our definition, only to the 
cause-effect relation between single events, one has to change 
the standpoint. Instead of considering the acceleration of one 
body to be caused by the other bodies, one considers two
consecutive configurations of the whole system and asks whether
it makes sense to call the earlier one the cause of the later one.
But it makes no sense, for the relation between the two states is 
symmetrical. One could, with the same right, call the later 
configuration the cause of the earlier one. 

The root of this symmetry is Newton's definition of time. 
Whatever he says about the notion of time (in Principia, 
Scholium I) as a uniform flow, the use he makes of it contains 
nothing of a flow in one direction. Newton's time is just an 
independent variable t appearing in the equations of motion, in
such a way that if t is changed into $-t$, the equations remain
the same. It follows that, if all velocities are reversed, the
system just goes back the same way ; it is completely reversible. 
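The reversibility described here can be illustrated numerically. The sketch below is only an illustration under stated assumptions: the force law (a unit-mass particle in the potential x²/2 + x⁴/4) and the initial values are arbitrary choices not taken from the text, and the velocity-Verlet scheme is used because it shares the time-reversal symmetry of Newton's equations.

```python
def acceleration(x):
    """Acceleration of a unit-mass particle in the assumed potential
    V(x) = x**2/2 + x**4/4 (an illustrative choice, not from the text)."""
    return -x - x ** 3


def verlet(x, v, dt, steps):
    """Velocity-Verlet integration of Newton's equation of motion.
    The scheme is symmetric under the exchange of t and -t."""
    a = acceleration(x)
    for _ in range(steps):
        v_half = v + 0.5 * dt * a
        x = x + dt * v_half
        a = acceleration(x)
        v = v_half + 0.5 * dt * a
    return x, v


x0, v0 = 1.0, 0.3
x1, v1 = verlet(x0, v0, dt=0.01, steps=2000)    # run forward in time
x2, v2 = verlet(x1, -v1, dt=0.01, steps=2000)   # reverse the velocity, run again

# The system retraces its path: it arrives at x0 with velocity -v0,
# up to floating-point rounding.
print(abs(x2 - x0), abs(v2 + v0))
```

Nothing in the equations distinguishes the forward run from the backward one; only the sign of the initial velocity differs.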

Newton's time variable is obviously an idealization abstracted 
from simple mechanical models and astronomical observations, 
fitting well into celestial motion, but not into ordinary experi- 
ence. To us it appears that life on earth is going definitely in one 
direction, from past to future, from birth to death, and the 
perception of time in our mind is that of an irresistible and 
irreversible current. 

Another feature of Newton's dynamics was repugnant to 
many of his contemporaries, in particular the followers of 
Descartes, whose cosmology, whatever else its shortcomings, 
satisfied the principle of contiguity, as I have called the condition 
that cause and effect should be in spatial contact. Newton's 
forces, the quantitative expressions for causes of motion, are sup- 
posed to act through empty space, so that cause and effect are 
simultaneous whatever the distance. Newton himself refrained 
from entering into a metaphysical controversy and insisted that 
the facts led unambiguously to his results. Indeed, the language 
of facts was so strong that they silenced the philosophical objec- 
tions, and only when new facts revealed to a later generation the 
propagation of forces with finite velocity, was the problem of 
contiguity in gravitation taken up. In spite of these difficulties, 
Newton's dynamics has served many generations of physicists 
and is useful, even indispensable, to-day. 



ALTHOUGH I maintain that neither causality itself nor its 
attributes, which I called the principles of antecedence and 
contiguity, are metaphysical, and that only the inference by 
induction transcends experience, there is no doubt that these 
ideas have a strong power over the human mind, and we have 
evidence enough that they have influenced the development of 
classical physics. Much effort has been made to reconcile 
Newton's laws with these postulates. Contiguity is closely bound 
up with the introduction of contact forces, pressures, tensions, 
first in ordinary material bodies, then in the electromagnetic 
ether, and thus to the idea of fields of forces ; but the systematic 
application of contiguity to gravitation exploded Newton's 
theory, which was superseded by Einstein's relativity. Similar 
was the fate of the postulate of antecedence ; it is closely bound 
up with irreversibility in time, and found its first quantitative 
formulation in thermodynamics. The reconciliation of it with 
Newton's laws was attempted by atomistics and physical 
statistics ; the idea being that accumulations of immense num- 
bers of invisible Newtonian particles, atoms, or molecules appear 
to the observer to have the feature of irreversibility for statistical 
reasons. The atoms were first hypothetical, but soon they were 
taken seriously, and one began to search for them, with increas- 
ing success. They became more and more real, and finally even 
visible. And then it turned out that they were no Newtonian
particles at all. Whereupon the whole classical physics exploded, 
to be replaced by quantum theory. Looked at from the point 
of view of our principles, the situation in quantum theory is 
reversed. Determinism (which is so prominent a characteristic 
of Newton's theory) is abandoned, but contiguity and ante- 
cedence (violated by Newton's laws) are preserved to a consider- 
able degree. Causality, which in my formulation is independent 
of antecedence and contiguity, is not affected by these changes:
scientific work will always be the search for causal interdependence
of phenomena.

After this summary of the following discussion, let us return to 
the question why violation of the principles of contiguity and antecedence
in Newton's theory was first accepted, though not without
protest, but later amended and finally rejected. This change
was due to the transition from celestial to terrestrial mechanics. 

The success of Newton's theory was mainly in the field of 
planetary motion, and there it was overwhelming indeed. It is 
not my purpose to expand on the history of astronomy after 
Newton ; it suffices to remember that the power of analytical 
mechanics to describe and predict accurately the observations 
led many to the conviction that it was the final formulation of 
the ultimate laws of nature. 

The main attention was paid to the mathematical investiga- 
tion of the equations of motion, and the works of Lagrange, 
Laplace, Gauss, Hamilton, and many others are a lasting 
memorial of this epoch. Of all these writings, I shall dwell only 
for a moment on that of Hamilton, because his formulation of 
Newton's laws is the most general and elegant one, and because 
they will be used over and over again in the following lectures. 
So permit me a short mathematical interlude which has nothing 
directly to do with cause and chance. 

Hamilton considers a system of particles described by any
(in general non-Cartesian) coordinates $q_1, q_2, \ldots$; then the potential
energy is a function of these, $V(q_1, q_2, \ldots)$ or shortly $V(q)$, and
the kinetic energy T a function of both the coordinates and the
generalized velocities $\dot q_1, \dot q_2, \ldots$, $T(q, \dot q)$. He then defines
generalized momenta

$$p_\alpha = \frac{\partial T}{\partial \dot q_\alpha} \tag{4.1}$$

and regards the total energy T+V as a function of the $q_\alpha$ and $p_\alpha$.
This function

$$T + V = H(q, p) \tag{4.2}$$

is to-day called the Hamiltonian.

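The equations of motion assume the simple 'canonical' form

$$\dot q_\alpha = \frac{\partial H}{\partial p_\alpha}, \qquad \dot p_\alpha = -\frac{\partial H}{\partial q_\alpha}, \tag{4.3}$$

from which one reads at once the conservation law of energy,

$$\frac{dH}{dt} = 0, \qquad H = \text{const.} \tag{4.4}$$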


It is this set of formulae which has survived the most violent 
revolution of physical ideas which has ever taken place, the 
transition to quantum mechanics. 
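A small numerical sketch may make the canonical form concrete. It integrates dq/dt = ∂H/∂p, dp/dt = −∂H/∂q for one degree of freedom and monitors the value of H. The pendulum Hamiltonian with unit mass, length, and gravity, the leapfrog scheme, and the step size are all assumptions chosen for illustration; none of them comes from the text.

```python
import math


def hamiltonian(q, p):
    """H = T + V for a pendulum of unit mass, length, and gravity
    (an assumed example system)."""
    return 0.5 * p * p - math.cos(q)


def dH_dq(q, p):
    return math.sin(q)


def dH_dp(q, p):
    return p


def integrate(q, p, dt, steps):
    """Leapfrog integration of the canonical equations
    dq/dt = dH/dp, dp/dt = -dH/dq.
    Returns the final state and the largest deviation of H from its start value."""
    h0 = hamiltonian(q, p)
    max_drift = 0.0
    for _ in range(steps):
        p -= 0.5 * dt * dH_dq(q, p)   # half step in the momentum
        q += dt * dH_dp(q, p)         # full step in the coordinate
        p -= 0.5 * dt * dH_dq(q, p)   # second half step in the momentum
        max_drift = max(max_drift, abs(hamiltonian(q, p) - h0))
    return q, p, max_drift


q_end, p_end, max_drift = integrate(q=1.0, p=0.0, dt=0.001, steps=20000)
print(f"largest |H - H0| over the run: {max_drift:.2e}")
```

The small residual drift is an artefact of the finite step; for the exact solution H is strictly constant.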

Returning now to the post-Newtonian period there was, 
simultaneous with the astronomical applications and confirma- 
tions of the theory, a lively interest in applying it to ordinary 
terrestrial physics. Even here Newton had shown the way and 
had calculated, for instance, the velocity of sound in a fluid. 
Eventually the mechanics of elastic solids brought about a 
modification of Newton's definition of force which satisfies con- 
tiguity. Much of this work is due to the great mathematician 
Cauchy. He started, as many before him, by treating a solid as 
an aggregate of tiny particles, acting on one another with 
Newtonian non-contiguous forces of short range, anticipating
to some degree the modern atomistic standpoint. But there was 
of course, at that time, no evidence of the physical reality of 
these particles. In the physical applications all traces of them 
were obliterated by averaging. The form of these results sug- 
gested to Cauchy another method of approach where particle 
mechanics is completely discarded. Matter is considered to be a 
real continuum in the mathematical sense, so that it has a 
meaning to speak of a force between two pieces of matter 
separated by a surface. This seems to be, from our modern 
standpoint, a step in the wrong direction, as we know matter 
to be discontinuous. But Cauchy's work showed how contiguity
could be introduced into mechanics; the importance of
this point became evident when the new method was applied to 
the ether, the carrier of light and of electric and magnetic forces, 
which even to-day is still regarded as continuous though it has 
lost most of the characteristic properties of a substance and can 
hardly be called a continuous medium. 

In this theory all laws appear in the form of partial differential 
equations, in which the three space-coordinates appear together 
with the time as independent coordinates. 


I shall give a short sketch of the mechanics of continuous
media.

Mass, velocity, and all other properties of matter are considered
continuously distributed in space. The mass per unit
volume or density $\rho$ is then a function of the space coordinates,
and the same holds for the current of mass $\mathbf u = \rho\mathbf v$ (namely the
quantity of mass passing through a surface per unit area and
unit time). The conservation (indestructibility) of mass then
leads to the so-called continuity equation (see Appendix, 3)

$$\dot\rho + \operatorname{div}\mathbf u = 0. \tag{4.5}$$

Concerning the forces, one has to assume that, if the sub- 
stance is regarded as separated into two parts by a surface, each 
part exerts a push or pull through this surface on the other which, 
measured per unit area, is called tension or stress. A simple 
mathematical consideration, based on the equilibrium conditions 
for the resultant forces acting on the surfaces of a volume 
element, shows that it suffices to define these tension forces for 
three non-coplanar surface elements, say those parallel to the 
three coordinate planes; the force on the element normal to x
being $\mathbf T_x$ with components $T_{xx}, T_{xy}, T_{xz}$, the other two forces
correspondingly $\mathbf T_y\,(T_{yx}, T_{yy}, T_{yz})$ and $\mathbf T_z\,(T_{zx}, T_{zy}, T_{zz})$. Then the
force on a surface element with the normal unit vector
$\mathbf n\,(n_x, n_y, n_z)$ is given by

$$\mathbf T_n = \mathbf T_x n_x + \mathbf T_y n_y + \mathbf T_z n_z. \tag{4.6}$$

Application of the law of moments to a small volume element
shows (see Appendix, 3) that

$$T_{yz} = T_{zy}, \qquad T_{zx} = T_{xz}, \qquad T_{xy} = T_{yx}. \tag{4.7}$$

Hence the quantities T form a symmetrical matrix, the stress
tensor

$$\mathsf T = \begin{pmatrix} T_{xx} & T_{xy} & T_{xz} \\ T_{yx} & T_{yy} & T_{yz} \\ T_{zx} & T_{zy} & T_{zz} \end{pmatrix}. \tag{4.8}$$
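The rule for the force on an arbitrarily oriented surface element, T_n = T_x n_x + T_y n_y + T_z n_z, together with the symmetry of the stress components, can be sketched in a few lines. The numerical stress values below are hypothetical and in arbitrary units; they serve only to show how the three forces T_x, T_y, T_z combine.

```python
import math

# Hypothetical stress components (arbitrary units), chosen symmetric.
STRESS = [
    [2.0, 0.5, 0.0],   # T_x = (T_xx, T_xy, T_xz)
    [0.5, 1.0, 0.3],   # T_y = (T_yx, T_yy, T_yz)
    [0.0, 0.3, 4.0],   # T_z = (T_zx, T_zy, T_zz)
]


def is_symmetric(t):
    """Check T_yz = T_zy, T_zx = T_xz, T_xy = T_yx."""
    return all(t[i][j] == t[j][i] for i in range(3) for j in range(3))


def traction(t, n):
    """Force per unit area on a surface element with unit normal n:
    T_n = T_x*n_x + T_y*n_y + T_z*n_z, evaluated component by component."""
    return [sum(t[i][k] * n[i] for i in range(3)) for k in range(3)]


normal = [1 / math.sqrt(2), 1 / math.sqrt(2), 0.0]  # bisects the x and y axes
print(is_symmetric(STRESS), traction(STRESS, normal))
```

For a normal along one coordinate axis the function simply returns the corresponding row, which is the definition of T_x, T_y, T_z given above.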

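Newton's law applied to a volume element then leads to the
equations

$$\rho\,\frac{d\mathbf v}{dt} = \operatorname{div}\mathsf T, \tag{4.9}$$

where div T is a vector with the components

$$(\operatorname{div}\mathsf T)_x = \frac{\partial T_{xx}}{\partial x} + \frac{\partial T_{xy}}{\partial y} + \frac{\partial T_{xz}}{\partial z}, \ \ldots, \tag{4.10}$$

and d/dt the operator

$$\frac{d}{dt} = \frac{\partial}{\partial t} + v_x\frac{\partial}{\partial x} + v_y\frac{\partial}{\partial y} + v_z\frac{\partial}{\partial z}, \tag{4.11}$$

which is called the 'convective derivative'.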

(4.9) together with (4.5) are the new equations of motion 
which satisfy the postulate of contiguity. They are the proto- 
type for all subsequent field theories. In the present form they 
are still incomplete and rather void of meaning, as the stress 
tensor is not specified in its dependence on the physical conditions
of the system, just in the same way as Newton's equations
are void of meaning if the forces are not specified with their 
dependence on the configuration of the particles. The configura- 
tion of a continuous system cannot be described by the values 
of a finite number of variables, but by certain space functions, 
called 'strain-components'. They are defined in this way:
A small (infinitesimal) volume of initially spherical shape will be
transformed by the deformation into an ellipsoid; the equation
of this has the form

$$e_{11}x^2 + e_{22}y^2 + e_{33}z^2 + 2e_{23}yz + 2e_{31}zx + 2e_{12}xy = \epsilon, \tag{4.12}$$

where $\epsilon$ is an (infinitesimal) constant, measuring the absolute
dimensions, and $e_{11}, e_{22}, \ldots, e_{12}$ are six quantities depending on
the position x, y, z of the centre of the sphere. These $e_{ij}$ are the
components of the strain tensor e.

In the theory of elasticity it is assumed that the stress components
$T_{ij}$ are linear functions of the strain components $e_{ij}$
(Hooke's law).

In hydrodynamics the relation between T and e involves
space- and time-derivatives of $e_{ij}$. In plastic solids the situation
is still more complicated.

We need not enter into these different branches of the 
mechanics in continuous media. The only important point for 
us is this : Contact forces spread not instantaneously but with 
finite velocity. This is the main feature distinguishing Cauchy's
contiguous mechanics from Newton's non-contiguous. The
simplest example is an elastic fluid (liquid or gas). Here the
stress tensor T has only diagonal elements which are equal and
represent the pressure p. The configuration can also be described
by one variable, the density $\rho$ or, for a given mass, the volume V.
The relation between p and V may be any function p = f(V); we
shall have to remember this later when we have to deal with
thermodynamics. For small disturbances of equilibrium the
general equations reduce to linear ones; any quantity $\phi$ in an
isotropic fluid (change of volume or pressure) satisfies the linear
wave equation

$$\frac{1}{c^2}\frac{\partial^2\phi}{\partial t^2} = \Delta\phi, \tag{4.13}$$

where $\Delta$ is Laplace's differential operator

$$\Delta = \frac{\partial^2}{\partial x^2} + \frac{\partial^2}{\partial y^2} + \frac{\partial^2}{\partial z^2}, \tag{4.14}$$

and c a constant which is easily found to mean the phase
velocity of a plane harmonic wave

$$\phi = A\sin(x - ct).$$
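That the constant c is the phase velocity of a plane harmonic wave can be checked numerically. The sketch below takes the wave equation for a disturbance depending on x and t only, in the form (1/c²) ∂²φ/∂t² = ∂²φ/∂x², and tests a harmonic wave with finite differences. The values of c and A, the wave number K (introduced here so that the argument of the sine is dimensionless), and the sample point are assumptions made for illustration.

```python
import math

C = 2.0   # assumed phase velocity
A = 1.5   # assumed amplitude
K = 3.0   # assumed wave number (an addition for illustration)


def phi(x, t):
    """Plane harmonic wave travelling in the +x direction with phase velocity C."""
    return A * math.sin(K * (x - C * t))


def second_difference(f, s, h):
    """Central finite-difference estimate of the second derivative of f at s."""
    return (f(s + h) - 2 * f(s) + f(s - h)) / (h * h)


x0, t0, h = 0.4, 0.7, 1e-4
phi_tt = second_difference(lambda t: phi(x0, t), t0, h)
phi_xx = second_difference(lambda x: phi(x, t0), x0, h)

# For an exact solution this residual vanishes; here it is limited only by
# the finite-difference and rounding errors.
residual = phi_tt / C ** 2 - phi_xx
print(f"(1/c^2) phi_tt - phi_xx = {residual:.2e}")
```

Replacing C inside `phi` by any other velocity while keeping it in the residual makes the residual of the same order as the derivatives themselves, which is one way to see that c fixes the speed of the wave.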

The equation (4.13) links up mechanics with other branches 
of physics which have independently developed, optics and
electrodynamics.


The history of optics, in particular Newton's contributions 
and his dispute with Huygens about the corpuscular or wave 
nature of light, is so well known that I need not speak about it. 
A hundred years after Newton, the wave nature of light was 
established by Young and Fresnel with the help of experiments 
on diffraction and interference. Wave equations of the type 
(4.13) were used as a matter of course to describe the observations,
where now $\phi$ means the amplitude of the vibration.

But what is it that vibrates? A name, 'ether', was ready to
hand, and its ability to propagate transverse waves suggested
that it was comparable to an elastic solid. In this way it came
to pass that the ether-filled vacuum was the carrier of contact
forces, spreading with finite velocity. They existed for a long
period peacefully beside Newton's instantaneous forces of 
gravitation, and other similar forces introduced to describe 
elementary experiences in electricity and magnetism. These 
forces are usually connected with Coulomb's name, who verified 
them by direct measurements of the intensity of attraction and 
repulsion between small charged bodies, and between the 
poles of needle-shaped magnets. He found a law of the same 
type as that of Newton, of the form $\mu r^{-2}$, where the constant $\mu$
depends on the state of electrification or magnetization respectively
of the interacting particles; by applying the law of
action and reaction $\mu$ can be split into factors, $\mu = e_1 e_2$ in the
electric case, where $e_1, e_2$ are called the charges. It must,
however, be remarked that this law was already
established earlier and with a higher degree of accuracy by 
Cavendish and Priestley by an indirect reasoning, with the help 
of the fact that a closed conductor screens a charged particle 
from the influence of outside charges; this argument, though still 
dressed up in the language of Newtonian forces, is already quite 
close to notions of field theory. 

It was the attempt to formulate the mechanical interactions
between linear currents (in thin wires) in terms of Newtonian
forces which entangled physics in the first part of the nineteenth
century in serious difficulties. Meanwhile Faraday had begun
his investigations unbiased by any mathematical theory, and 
accumulated direct evidence for understanding electric and 
magnetic phenomena with the help of contact forces. He spoke 
about pressures and tensions in the media surrounding charged 
bodies, using the expressions introduced in the theory of elas- 
ticity, yet with considerable and somewhat strange modifica- 
tions. Indeed, the strangeness of these assumptions made it 
difficult for his learned contemporaries to accept his ideas and 
to discard the well-established Newtonian fashion of descrip- 
tion. Yet seen from our modern standpoint, there is no intrinsic 
difference between the two methods, as long as only static and 
stationary phenomena are considered. Mathematical analysis 
shows that the resultant forces on observable bodies can be
expressed either as integrals over elementary contributions of
the Newtonian, or better Coulombian, type acting over the
distance, or by surface integrals of tensions derived from field 
equations. This holds not only for conductors in vacuo, but also 
for dielectric and magnetizable substances ; it is true that in the 
latter case the Coulombian forces lead to integral equations 
which are somewhat involved, but the differential equations of 
the field are, in spite of their simpler aspect, not intrinsically 
simpler. This is often overlooked in modern text-books. How- 
ever, in Faraday's time this equivalence of differential and 
integral equations for the forces was not known, and if it had 
been, Faraday would not have cared. His conviction of the 
superiority of contact forces over Coulombian forces rested on 
his physical intuition. It needed another, more mathematically- 
minded genius, Clerk Maxwell, to find the clue which made it 
impossible to accept forces acting instantaneously over finite 
distances: the finite velocity of propagation. It is not easy to 
analyse exactly the epistemological and experimental founda- 
tions of Maxwell's prediction, as his first papers make use of 
rather weird models and the purity of his thought appears only 
in his later publications. I think the process which led to Max- 
well's equations, stripped of all unnecessary verbiage and round- 
about ways, was this : By combining all the known experimental 
facts about charges, magnetic poles, currents, and the forces 
between them, he could establish a set of field equations con- 
necting the spatial and temporal changes of the electric and 
magnetic field strength (force per unit charge) with the electric 
charge density and current. But if these were combined with 
the condition that any change of charge could occur only by 
means of a current (expressed by a continuity equation analo- 
gous to (4.5) ; see Appendix, 4), an inadequacy became obvious. 
In the language of that time, the result was formulated by 
saying that no open currents (like discharge of condensers)
could be described by this theory. Therefore something was
wrong in the equations, and an inspection showed a suspicious 
feature, a lack of symmetry. The terms expressing Faraday's 
induction law (production of electric force by the time variation
of the magnetic field) had no counterpart obtained by exchanging
the symbols for electric and magnetic quantities (production of 
magnetic force by the time variation of the electric field). With- 
out any direct experimental evidence Maxwell postulated this 
inverse effect and added to his equations the corresponding 
term, which expresses that a change of the electric field (dis- 
placement current) is, in its magnetic action, equivalent to an 
ordinary current. It was a guess based on a belief in harmony. 
Yet by some mathematical reasoning it can be connected with 
one single but highly significant fact which sufficed to convince 
Maxwell of the correctness of his conjecture, just as Newton
was convinced of the correctness of his law of gravitation by one
single numerical coincidence, the calculation of terrestrial
gravity from the moon's orbit. Maxwell showed that his modified
equations had solutions representing waves, the velocity, c, of 
which could be expressed in terms of purely electric and magnetic 
constants ; for the vacuum c turned out to be equal to the ratio 
of a unit of charge measured electrostatically (by Coulomb's 
law) and electromagnetically (by Oersted's law). This ratio, a 
quantity of the dimensions of a velocity, was known from 
measurements by Kohlrausch and Weber, and its numerical 
value coincided with the velocity of light. That could hardly 
be accidental, indeed, and Maxwell could pronounce the electro- 
magnetic theory of light. 

The final confirmation of Maxwell's theory was, after his 
death, obtained by Hertz's discovery of electromagnetic waves.

I cannot follow the further course of events in the establish- 
ment of electromagnetic theory. I only wish to stress the point 
that the use of contact forces and field equations, i.e. the 
establishment of contiguity, in electromagnetism was the result 
of a long struggle against preconceptions of Newtonian origin. 
This confirms my view that the question of contiguity is not a 
metaphysical one, but an empirical one. 

We have now to see whether the laws of electromagnetism 
satisfy the principle of antecedence. An inspection of Maxwell's 
equations (see Appendix, 4) shows that a reversal of time,
$t \to -t$, leaves everything, including the continuity equation,
unchanged, if the electric density and field are kept unchanged
while the electric current and magnetic field are reversed. This 
is a kind of reversibility very similar to that of mechanics, 
where a change of the sign of all velocities makes the system 
return to its initial state. The difference is only a practical one : 
a change of sign of all current densities and the whole magnetic 
field is not as simple to perform as that of a finite set of velocities. 
The situation is best seen by considering an electromagnetic 
wave spreading from a point source ; the corresponding solution 
of Maxwell's equation is given by so-called retarded potentials 
which express the electromagnetic state at a point P for the time 
t in terms of the motion of the source at the time $t - r/c$, where r
is the distance of P from the source. But there also exist other 
solutions, advanced potentials, which refer to the later time 
t+r/c and represent a wave contracting towards the source. 
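The two kinds of solution can be compared in a simplified setting. The sketch below replaces the electromagnetic potentials by a single scalar spherical wave φ(r, t) obeying (1/c²) ∂²φ/∂t² = (1/r) ∂²(rφ)/∂r², which is an assumption made for brevity, and uses a hypothetical Gaussian pulse as the source signal. It checks that both f(t − r/c)/r and f(t + r/c)/r are solutions, and that reversing the time turns one into the other.

```python
import math

C = 1.0  # assumed propagation velocity in arbitrary units


def source(t):
    """Assumed source signal: a smooth pulse emitted around t = 0."""
    return math.exp(-t * t)


def retarded(r, t):
    """Outgoing wave: the state at distance r reflects the source at t - r/C."""
    return source(t - r / C) / r


def advanced(r, t):
    """Contracting wave: refers to the source at the later time t + r/C."""
    return source(t + r / C) / r


def wave_residual(phi, r, t, h=1e-3):
    """(1/C^2) d2(phi)/dt2 - (1/r) d2(r*phi)/dr2 by central differences;
    this vanishes for a spherical wave solution."""
    phi_tt = (phi(r, t + h) - 2 * phi(r, t) + phi(r, t - h)) / h ** 2
    rphi_rr = ((r + h) * phi(r + h, t) - 2 * r * phi(r, t)
               + (r - h) * phi(r - h, t)) / h ** 2
    return phi_tt / C ** 2 - rphi_rr / r


r0, t0 = 2.0, 1.5
print(wave_residual(retarded, r0, t0), wave_residual(advanced, r0, t0))

# Time reversal t -> -t maps the retarded solution onto the advanced one,
# because the assumed pulse is symmetric in time.
print(retarded(r0, t0) - advanced(r0, -t0))
```

The equation itself gives no reason to prefer one solution over the other; the preference has to be supplied from outside.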

Such contracting waves are of course necessary for solving 
certain problems. Imagine, for instance, a spherical wave 
reflected by a concentric spherical mirror. However, such a 
mirror must be absolutely perfect to do its duty, and there 
appears to be something improbable about the occurrence of 
advanced potentials in nature. For the description of ele- 
mentary processes of emission of atoms or electrons one has 
supplemented Maxwell's equations by the rule that only retarded 
solutions are allowed. In this way a kind of irreversibility can 
be introduced and the principle of antecedence satisfied. But
this is altogether artificial and unsatisfactory. The irreversibility
of actual electromagnetic processes has its roots in other
facts which we shall later have to describe in detail. Maxwell's
equations themselves do not satisfy the postulate of antecedence. 

The situation which we have now reached is that which I 
found when I began to study almost half a century ago. There 
existed, more or less peacefully side by side, Newton's mechanics 
of instantaneous action over any distance, Cauchy's mechanics 
of continuous substances, and Maxwell's electrodynamics, the 
latter two satisfying the postulate of contiguity. Of these theories,
Maxwell's seemed to be the most promising and fertile, and the
idea began to spread that possibly all forces of nature might be of 
electromagnetic origin. The problem had to be envisaged, how to 
reconcile Newton's gravitational forces with the postulate of con- 
tiguity ; the solution was Einstein's general theory of relativity. 

This is a long and interesting story by itself which involves 
not only the notion of cause with which we are concerned here, 
but other philosophical concepts, namely those concerned with 
space and time. A detailed discussion of these problems would 
lead us too far away from our subject, and I think it hardly 
necessary to dwell on them because relativity is to-day widely 
known and part of the syllabus of the student of mathematics 
and physics as well. So I shall give a very short outline only. 

The physical problems which led to the theory of relativity 
were those concerned with the optical and electromagnetic 
phenomena of fast-moving bodies. There are two types of 
experiments : those using the high velocity of celestial bodies 
(e.g. Michelson's and Morley's experiment) and those using fast 
electrons or ions (e.g. Bucherer's measurement of the mass of 
electrons in cathode rays as a function of the velocity). The work
of Lorentz, FitzGerald, Poincaré, and others prepared the ground
for Einstein's discovery that the root of all difficulties was the 
assumption of a universal time valid for all moving systems of 
reference. He showed that this assumption has no foundation 
in any possible experience and he replaced it by a simple defini- 
tion of relative time, valid in a given coordinate system, but 
different from the time of another system in relative motion. 
The formal law of transformation from one space-time system
to another was already known, owing to an analysis of Lorentz;
it is in fact an intrinsic property of Maxwell's equations. The
Lorentz transformation is linear; it expresses the physical
equivalence of systems in relative motion with constant velocity 
(see Appendix, 5). 
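The two properties mentioned here, linearity and the equivalence of systems in uniform relative motion, can be illustrated with the standard form of the transformation for motion along the x-axis. The formula used below is the usual textbook one, written out here as an assumption since the text refers to the Appendix for it; the event coordinates and the velocity are arbitrary sample values.

```python
import math

C = 299_792_458.0  # velocity of light in m/s


def lorentz(x, t, v):
    """Lorentz transformation to a system moving with constant velocity v along x."""
    gamma = 1.0 / math.sqrt(1.0 - (v / C) ** 2)
    return gamma * (x - v * t), gamma * (t - v * x / C ** 2)


def interval(x, t):
    """The quantity x^2 - (c t)^2, which has the same value in both systems."""
    return x ** 2 - (C * t) ** 2


x, t, v = 4.0e8, 1.0, 0.6 * C
x_new, t_new = lorentz(x, t, v)
print(interval(x, t), interval(x_new, t_new))

# Linearity: transforming the sum of two events gives the sum of the transforms.
xa, ta = lorentz(1.0e8, 0.2, v)
xb, tb = lorentz(3.0e8, 0.8, v)
print(x_new - (xa + xb), t_new - (ta + tb))
```

The time t_new differs from t, so each system carries its own time, while the combination x² − c²t² is common to both.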

Einstein's theory of gravitation is formally based on a general- 
ization of these transformations into arbitrary, non-linear ones ; 
with the help of these one can express the transition from one 
system of reference to another accelerated (and simultaneously
deformed) one. The physical idea behind this mathematical
formalism has been already mentioned: the exact proportion- 
ality of mass, as defined by inertia, and of mass as defined by 
gravitation, equation (3.10); or, in other words, the fact that 
in Newton's law of gravitational motion (3.4) the (inertial) mass 
does not appear. 

Einstein succeeded in establishing equations for the gravitational
field by identifying the components of this field with the
quantities $g_{\mu\nu}$ which define the geometry of space-time, namely
the coefficients of the line element

$$ds^2 = \sum_{\mu,\nu} g_{\mu\nu}\,dx^\mu\,dx^\nu, \tag{4.15}$$

where $x^1, x^2, x^3$ stand for the space coordinates x, y, z, $x^4$ for the
time t.

In ordinary 3-dimensional Euclidean geometry the $g_{\mu\nu}$ are
constant and can, by a proper choice of the coordinate system,
be normalized in such a way that

$$g_{11} = g_{22} = g_{33} = 1, \qquad g_{\mu\nu} = 0 \ \text{for}\ \mu \neq \nu.$$

Minkowski showed that special relativity can be regarded as
a 4-dimensional geometry, where time is added as the fourth
coordinate, but still with constant $g_{\mu\nu}$, which can be normalized to

$$g_{11} = g_{22} = g_{33} = 1, \qquad g_{44} = -1, \qquad g_{\mu\nu} = 0 \ \text{for}\ \mu \neq \nu. \tag{4.16}$$

It was further known from the work of Riemann that a very
general type of non-Euclidean geometry in 3-dimensional space
could be obtained by taking the $g_{\mu\nu}$ as variable functions of
$x^1, x^2, x^3$, and the mathematical properties of this geometry had
been thoroughly studied (Levi-Civita, Ricci).

Einstein generalized Riemann's formalism to four dimensions,
assuming that the $g_{\mu\nu}$ depend not only on $x^1, x^2, x^3$, but also on
$x^4$, the time. However, he regarded the $g_{\mu\nu}$ not as given functions
of $x^1, x^2, x^3, x^4$ but as field quantities to be calculated from the
distribution of matter. He formed a set of quantities $R_{\mu\nu}$ which
can be regarded as a measure of the 'curvature' of space and are
functions of the $g_{\mu\nu}$ and their first and second derivatives, and
postulated equations of the form

$$R_{\mu\nu} = \kappa T_{\mu\nu}, \tag{4.17}$$


where $\kappa$ is a constant and the $T_{\mu\nu}$ are generalizations of the
tensions in matter, defined in (4.8): one has to supplement the
tensor T by a fourth row and column, where $T_{14}, T_{24}, T_{34}$ are
the components of the density of momentum, $T_{44}$ the density of
energy. These equations (4.17) are invariant in a very general
sense, namely for all continuous transformations of space-time, 
and they are essentially uniquely determined by this property 
and the postulate that no higher derivatives than the second- 
order ones should appear. 

If the distribution of matter is given, i.e. the $T_{\mu\nu}$ are known,
the field equations (4.17) allow one to calculate the $g_{\mu\nu}$, i.e. the
geometry of space. Einstein found the solution for a mass point 
as source of the field, and by assuming that the motion of another 
particle was determined by a geodesic, or shortest, or straightest 
line in this geometry, he showed that Newton's laws of planetary 
motion follow as a first approximation. But higher approxima- 
tions lead to small deviations, some of which can be observed. 
I cannot enter into the discussion of all the consequences of the 
new gravitational theory ; Einstein's predictions have been con- 
firmed, although some of them are at the limit of observational 
technique. But I wish to add a remark about a theoretical point 
which is not so well known, yet very important. The assump- 
tion that the motion of a particle is given by a geodesic is 
obviously an unsatisfactory feature ; one would expect that the 
field equations alone should determine not only the field pro- 
duced by particles but also the reaction of the particles to the 
field, that is their motion. Einstein, with his collaborators 
Infeld and Hoffmann, has proved that this is in fact the case, 
and the same result has been obtained independently and, as I
think, in a considerably simpler way, by the Russian physicist
Fock. On the basis of these admirable papers, one can say that
the field theory of gravitation is logically perfect; whether it
will stand all observational tests remains to be seen.

From the standpoint of the philosophical problem, which is 
the subject of these lectures, there are several conclusions to be 
drawn. The first is that now physical geometry, that is, not some 
abstract mathematical system but the geometrical aspect of the 
behaviour of actual bodies, is subject to the cause-effect relation 
and to all related principles like any other branch of science. 
The mathematicians often stress the opposite point of view ; 
they speak of the geometrization of physics, but though it cannot 
be denied that the mathematical beauty of this method has 
inspired numerous valuable investigations, it seems to me an 
over-estimation of the formalism. The main point is that 
Einstein's geometrical mechanics or mechanical geometry 
satisfies the principle of contiguity. On the other hand, ante- 
cedence, applied to two consecutive configurations as cause and 
effect, is not satisfied, or not more than in electrodynamics ; for 
there is no intrinsic direction in the flow of time contained in 
the equations. The theory is deterministic, at least in principle : 
the future or past motion of particles and the distribution of the 
gravitational field are predictable from the equations, if the 
situation at a given time is known, together with boundary 
conditions (vanishing of field at infinity) for all times. But as 
the gravitational field travels between the particles with finite 
velocity, this statement is not identical with Newtonian deter- 
minism : a knowledge is needed, not only of all particles, but also 
of all gravitational waves (which do not exist in Newton's 
theory). Einstein himself values the deterministic feature of his 
theory very highly. He regards it as a postulate which has to 
be demanded from any physical theory, and he rejects, there- 
fore, parts of modern physics which do not satisfy it. 

Here I only wish to remark that determinism in field theories 
seems to me of very little significance. To illustrate the power 
of mechanics, Laplace invented a super-mathematician able to 
predict the future of the world provided the positions and veloci- 
ties of all particles at one moment were given to him. I can 
sympathize with him in his arduous task. But I would really 
pity him if he had not only to solve the numerous ordinary 
differential equations of Newtonian type but also the partial 
differential equations of the field theory with the particles as 


WE have now to discuss the experiences which make it possible 
to distinguish in an objective empirical way between past and 
future or, in our terminology, to establish the principle of 
antecedence in the chain of cause and effect. These experiences 
are connected with the production and transfer of heat. There 
would be a long story to tell about the preliminary steps 
necessary to translate the subjective phenomena of hot and cold 
into the objective language of physics : the distinction between 
the quality 'temperature' and the quantity 'heat', together 
with the invention of the corresponding instruments, the ther- 
mometer and calorimeter. I take the technical side of this 
development to be well known and I shall use the thermal 
concepts in the usual way, although I shall have to analyse them 
presently from the standpoint of scientific methodology. It was 
only natural that the measurable quantity heat was first 
regarded as a kind of invisible substance called caloric. The 
flow of heat was treated with the methods developed for material 
liquids, yet with one important difference : the inertia of the 
caloric fluid seemed to be negligible; its flow was determined 
by a differential equation which is not of the second but of the 
first order in time. It is obtained from the continuity equation 
(see (4.5)) 

$\frac{\partial Q}{\partial t} + \operatorname{div}\mathbf{q} = 0$ (5.1) 

by assuming that the change of the density of heat $Q$ is proportional 
to the change of temperature $T$, $\delta Q = c\,\delta T$ (where $c$ is the 
specific heat), while the current of heat $\mathbf{q}$ is proportional to the 
negative gradient of temperature, $\mathbf{q} = -\kappa \operatorname{grad} T$ (where $\kappa$ is 
the coefficient of conductivity). Hence 


$c\,\frac{\partial T}{\partial t} = \kappa\,\Delta T$, (5.2) 


a differential equation of the first order in time. This equa- 
tion was the starting-point of one of the greatest discoveries 
in mathematics, Fourier's theory of expansion of arbitrary 


functions in terms of orthogonal sets of simple periodic functions, 
the prototype of numerous similar expansions and the embryo 
from which a considerable part of modern analysis and mathe- 
matical physics developed. 

But that is not the aspect from which we have here to regard 
the equation (5.2) ; it is this: 

The equation does not allow a change of $t$ into $-t$; the result 
cannot be compensated by a change of sign of other variables 
as happens in Maxwell's equations. Hence the solutions exhibit 
an essential difference of past and future, a definite 'flow of time' 
as one is used to say, meaning, of course, a flow of events in 
time. For instance, an elementary solution of (5.2) for the 
temperature distribution in a thin wire along the $x$-direction is 

$T = \frac{A}{\sqrt{t}}\,e^{-c x^{2}/(4\kappa t)}$ ($A$ constant), (5.3) 

which describes the spreading and levelling out of an initially 
high temperature concentrated near the point x = 0, an 
obviously irreversible phenomenon. 
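The one-sidedness in time can be made concrete with a small numerical sketch. It assumes the standard Gaussian spreading solution of (5.2), $T = A\,t^{-1/2} e^{-c x^{2}/(4\kappa t)}$, checks by finite differences that it satisfies the equation for $t > 0$, and shows that it fails the equation obtained by replacing $t$ with $-t$. The function names, the amplitude and the sample values of $c$ and $\kappa$ are illustrative assumptions, not taken from the text.

```python
import math


def gaussian_temperature(x, t, c=1.0, kappa=1.0, amplitude=1.0):
    """Spreading solution of c*dT/dt = kappa*d2T/dx2, valid for t > 0."""
    return amplitude / math.sqrt(t) * math.exp(-c * x * x / (4.0 * kappa * t))


def heat_residual(x, t, c, kappa, time_sign=1.0, h=1e-3):
    """time_sign*c*dT/dt - kappa*d2T/dx2, estimated by central differences.

    time_sign=+1 tests equation (5.2); time_sign=-1 tests the equation
    with t replaced by -t.
    """
    def T(xx, tt):
        return gaussian_temperature(xx, tt, c, kappa)

    dT_dt = (T(x, t + h) - T(x, t - h)) / (2.0 * h)
    d2T_dx2 = (T(x + h, t) - 2.0 * T(x, t) + T(x - h, t)) / (h * h)
    return time_sign * c * dT_dt - kappa * d2T_dx2


# Sample point and material constants (arbitrary illustrative values).
forward = heat_residual(0.7, 1.3, c=2.0, kappa=0.5)
reversed_time = heat_residual(0.7, 1.3, c=2.0, kappa=0.5, time_sign=-1.0)

# Peak temperature at x = 0 for increasing times: it can only fall.
peaks = [gaussian_temperature(0.0, t) for t in (0.5, 1.0, 2.0, 4.0)]

print("residual of (5.2):", forward)
print("residual with t -> -t:", reversed_time)
print("peak temperatures:", peaks)
```

The forward residual vanishes up to discretization error, while the time-reversed one does not; no change of sign of another variable is available to repair it.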

I do not know enough of the history of physics to understand 
how this theory of heat conduction was reconciled with the 
general conviction that the ultimate laws of physics were of the 
Newtonian reversible type. 

Before a solution of this problem could be attempted another 
important step was necessary : the discovery of the equivalence 
of heat and mechanical work, or, as we say to-day, of the first 
law of thermodynamics. It is important to remember that this 
discovery was made considerably later than the invention of the 
steam-engine. Not only the production of heat by mechanical 
work (e.g. through friction), but also the production of work 
from heat (steam-engine) was known. The new feature was the 
statement that a given amount of heat always corresponds to a 
definite amount of mechanical work, its 'mechanical equivalent'. 
Robert Mayer pronounced this law on very scanty and indirect 
evidence, but obtained a fairly good value for the equivalent 
from known properties of gases, namely from the difference of 
heat necessary to raise the temperature by one degree if either 


the volume is kept constant or the gas allowed to do work 
against a constant pressure. Joule investigated the same 
problem by systematic experiments which proved the essential 
point, namely that the work necessary to transfer a system from 
one equilibrium state to another depends only on these two 
states, not on the process of application of the work. This is the 
real content of the first law ; the determination of the numerical 
value of the mechanical equivalent, so much stressed in text- 
books, is a matter of physical technique. To get our notions 
clear, we have now to return to the logical and philosophical 
foundations of the theory of heat. 
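Mayer's argument from the two specific heats of a gas can be sketched numerically. The sketch assumes an ideal gas, for which the work done by one gram heated through one degree at constant pressure is $R/M$, and it uses approximate modern values for air; these figures and the function name are assumptions for illustration, not Mayer's own data.

```python
# Approximate modern values (assumptions, not Mayer's original figures).
GAS_CONSTANT = 8.314      # J / (mol K)
MOLAR_MASS_AIR = 28.97    # g / mol
CP_AIR = 0.240            # cal / (g K), heating at constant pressure
CV_AIR = 0.1715           # cal / (g K), heating at constant volume


def mechanical_equivalent(cp, cv, molar_mass, gas_constant=GAS_CONSTANT):
    """Joules per calorie from the difference of the specific heats.

    The extra heat cp - cv (in calories) needed at constant pressure is
    equated to the expansion work R/M (in joules) done by one gram of an
    ideal gas when heated through one degree.
    """
    work_per_gram_degree = gas_constant / molar_mass
    extra_heat_per_gram_degree = cp - cv
    return work_per_gram_degree / extra_heat_per_gram_degree


joules_per_calorie = mechanical_equivalent(CP_AIR, CV_AIR, MOLAR_MASS_AIR)
print("mechanical equivalent of heat:", joules_per_calorie, "J/cal")
```

With these inputs the result comes out close to 4.2 joules per calorie, which shows how far known properties of gases can carry the argument without a single dedicated experiment.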

The problem is to transform the subjective sense impressions 
of hot and cold into objective measurable statements. The latter 
are, of course, again somewhere connected with sense impres- 
sions. You cannot read an instrument without looking at it. 
But there is a difference between this looking at, say, a thermo- 
meter with which a nurse measures the temperature of a patient 
and the feeling of being hot under which the patient suffers. 

It is a general principle of science to rid itself as much as 
possible from sense qualities. This is often misunderstood as 
meaning elimination of sense impressions, which, of course, is 
absurd. Science is based on observation, hence on the use of the 
senses. The problem is to eliminate the subjective features and 
to maintain only statements which can be confirmed by several 
individuals in an objective way. It is impossible to explain to 
anybody what I mean by saying 'This thing is red' or 'This 
thing is hot'. The most I can do is to find out whether other 
persons call the same things red or hot. Science aims at a closer 
relation between word and fact. Its method consists in finding 
correlations of one kind of subjective sense impressions with 
other kinds, using the one as indicators for the other, and in this 
way establishes what is called a fact of observation. 

Here I have ventured again into metaphysics. At least, a 
philosopher would claim that a thorough study of these 
methodological principles is beyond physics. I think it is again 
a rule of our craft as scientists, like the principle of inductive 
inference, and I shall not analyse it further at this moment. 



In the case of thermal phenomena, the problem is to define 
the quantities involved, temperature and heat, by means of 
observable objective changes in material bodies. It turns out 
that the concepts of mechanics, configuration and force, strain 
and stress, suffice for this purpose, but that the laws of mechanics 
have to be essentially changed. 

Let us consider for simplicity only systems of fluids, that is of 
continuous media, whose state in equilibrium is defined by one 
single strain quantity, the density, instead of which we can also, 
for a given mass, take the total volume V. There is also only one 
stress quantity, the pressure p. From the standpoint of 
mechanics the pressure in equilibrium is a given function of the 
volume, p = f(V). 

Now all those experiences which are connected with the 
subjective impression of making the fluid hotter or colder, show 
that this law of mechanics is wrong : the pressure can be changed 
at constant volume, namely 'by heating' or 'by cooling'. 

Hence the pressure p can be regarded as an independent 
variable besides the volume V, and this is exactly what thermo- 
dynamics does. 

The generalization for more complicated substances (such as 
those with rigidity or magnetic polarizability) is so obvious that 
I shall stick to the examples of fluids, characterized by two 
thermodynamically independent variables V,p. But it is 
necessary to consider systems consisting of several fluids, and 
therefore one has to say a word about different kinds of contact 
between them. 

To shorten the expression, one introduces the idea of 'walls' 
separating different fluids. These walls are supposed to be so 
thin that they play no other part in the physical behaviour of 
the system than to define the interaction between two neigh- 
bouring fluids. We shall assume every wall to be impenetrable 
to matter, although in theoretical chemistry semi-permeable 
partitions are used with great advantage. Two kinds of walls 
are to be considered. 

An adiabatic wall is defined by the property that equilibrium 
of a body enclosed by it is not disturbed by any external process 


as long as no part of the wall is moved (distance forces being 
excluded in the whole consideration). 

Two comments have to be made. The first is that the 
adiabatic property is here defined without using the notion of 
heat ; that is essential, for as it is our aim to define the thermal 
concepts in mechanical terms, we cannot use them in the ele- 
mentary definitions. The second remark is that adiabatic 
enclosure of a system can be practically realized, as in the Dewar 
vessel or thermos flask, with a high degree of approximation. 
Without this fact, thermodynamics would be utterly im- 

The ordinary presentations of this subject, though rather 
careless in their definitions, cannot avoid the assumption of the 
possibility of isolating a system thermally; without this no 
calorimeter would work and heat could not be measured. 

The second type of wall is the diathermanous wall, defined by 
the following property : if two bodies are separated by a diather- 
manous wall, they are not in equilibrium for arbitrary values of 
their variables $p_1, V_1$ and $p_2, V_2$, but only if a definite relation 
between these four quantities is satisfied 

$F(p_1, V_1, p_2, V_2) = 0$. (5.4) 

This is the expression of thermal contact; the wall is only intro- 
duced to symbolize the impossibility of exchange of material. 

The concept of temperature is based on the experience that 
two bodies, being in thermal equilibrium with a third one, are 
also in thermal equilibrium with one another. If we write (5.4) in 
the short form $F(1, 2) = 0$, this property of equilibrium can be 
expressed by saying that of the three equations 

$F(2, 3) = 0$, $F(3, 1) = 0$, $F(1, 2) = 0$, (5.5) 

any two always involve the third. This is only possible if 
(5.4) can be brought into the form 

$f_1(p_1, V_1) = f_2(p_2, V_2)$. (5.6) 

Now one can use one of the two bodies, say 2, as thermometer 
and introduce the value of the function 

$f_2(p_2, V_2) = \vartheta$ (5.7) 


as empirical temperature. Then one has for the other body the 
so-called equation of state 

$f_1(p_1, V_1) = \vartheta$. (5.8) 

Any arbitrary function of $\vartheta$ can be chosen as empirical tem- 
perature with equal right; the choice is restricted only by 
practical considerations. (It would be impractical to use a ther- 
mometric substance for which two distinguishable states are in 
thermal equilibrium.) The curves $\vartheta$ = const. in the $pV$-plane are 
independent of the temperature scale ; they are called isotherms. 

It is not superfluous to stress the extreme arbitrariness of the 
temperature scale. Any suitable property of any substance can 
be chosen as thermometric indicator, and if this is done, still 
the scale remains at our disposal. If we, for example, choose a 
gas at low pressures, because of the simplicity of the isothermal 
compression law $pV$ = const., there is no reason to take $pV = \vartheta$ 
as measure of temperature: one could just as well take $(pV)^2$ or 
$\sqrt{pV}$. The definition of an 'absolute' scale of temperature was 
therefore an urgent problem which was solved by the discovery 
of the second law of thermodynamics. 
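The arbitrariness of the scale is easily illustrated. The sketch below assumes a gas at low pressure with isotherms $pV$ = const. and compares three candidate scales; the particular states and the dictionary of scales are illustrative assumptions.

```python
import math

# Three equally admissible empirical temperature scales for a dilute gas.
scales = {
    "pV": lambda p, V: p * V,
    "(pV)^2": lambda p, V: (p * V) ** 2,
    "sqrt(pV)": lambda p, V: math.sqrt(p * V),
}

state_a = (2.0, 3.0)   # (pressure, volume), arbitrary units
state_b = (1.5, 4.0)   # different state with the same product pV
state_c = (2.0, 4.0)   # state with a larger product pV

# Every scale puts a and b on one isotherm and ranks c as hotter than a.
same_isotherm = {
    name: math.isclose(f(*state_a), f(*state_b)) for name, f in scales.items()
}
c_hotter_than_a = {
    name: f(*state_c) > f(*state_a) for name, f in scales.items()
}

print(same_isotherm)
print(c_hotter_than_a)
```

All three scales agree on which states lie on a common isotherm and on the order of hot and cold; they differ only in the numbers attached, which is why the isotherms are independent of the scale.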

The second fundamental concept of thermodynamics, that of 
heat, can be defined in terms of mechanical quantities by a 
proper interpretation of Joule's experiments. As I have pointed 
out already, the gist of these experiments lies in the following 
fact : If a body in an adiabatic enclosure is brought from one 
(equilibrium) state to another by applying external work, the 
amount of this work is always the same in whatever form 
(mechanical, electrical, etc.) and manner (slow or fast, etc.) it is 
applied. 

Hence for a given initial state $(p_0, V_0)$ the work done adia- 
batically is a function $U$ of the final state $(p, V)$, and one can 
write 

$W = U - U_0$; (5.9) 

the function U(p, V) is called the energy of the system. It is a 
quantity directly measurable by mechanical methods. 

If we now consider a non-adiabatic process leading from the 
initial state $(p_0, V_0)$ to the final state $(p, V)$, the difference 
$U - U_0 - W$ will not be zero, but can be determined if the energy 
function $U(p, V)$ is known from previous experiment. This 
difference 

$U - U_0 - W = Q$ (5.10) 

is called the heat supplied to the system during the process. 
Equation (5.10) is the definition of heat in terms of mechanical 
quantities. 

This procedure presupposes that mechanical work is measur- 
able however it is applied ; that means, for example, that the 
displacements of and the forces on the surface of a stirring- 
wheel in a fluid, or the current and resistance of a wire heating 
the fluid, must be registered even for the most violent reactions. 
Practically this is difficult, and one uses either stationary pro- 
cesses of a comparatively long duration where the irregular 
initial and final stages can be neglected (this includes heating 
by a stationary current), or extremely slow, 'quasi-static' 
processes ; these are in general (practically) reversible, since no 
kinetic energy is produced which could be irreversibly destroyed 
by friction. In ordinary thermodynamics one regards every 
curve in the $pV$-plane as the diagram of a reversible process ; 
that means that one allows infinitely slow heating or cooling by 
bringing the system into thermal contact with a series of large 
heat reservoirs which differ by small amounts of temperature. 
Such an assumption is artificial; it does not even remotely 
correspond to a real experiment. It is also quite superfluous. 
We can restrict ourselves to adiabatic quasi-static processes, 
consisting of slow movements of the (adiabatic) walls. For these 
the work done on a simple fluid is 

dW = -pdV, (5.11) 

where p is the equilibrium pressure, and the first theorem of 
thermodynamics (5.10) assumes the form 

dQ = dU+pdV = 0. (5.12) 
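The definition of heat (5.10) and the adiabatic condition (5.12) can be tried out on a simple model. The sketch assumes an ideal monatomic gas with energy function $U = pV/(\gamma - 1)$ and $\gamma = 5/3$; this energy function, the sample states and all names are assumptions made for illustration only.

```python
GAMMA = 5.0 / 3.0  # ratio of specific heats, monatomic ideal gas (assumption)


def energy(p, V):
    """Energy function U(p, V) of the assumed ideal gas."""
    return p * V / (GAMMA - 1.0)


def heat_supplied(initial, final, work):
    """Q = U - U0 - W, the definition of heat (5.10)."""
    return energy(*final) - energy(*initial) - work


def quasi_static_work(pressure_along_path, V0, V, steps=20000):
    """W = -integral of p dV (5.11), evaluated by the midpoint rule."""
    dV = (V - V0) / steps
    return -sum(
        pressure_along_path(V0 + (i + 0.5) * dV) * dV for i in range(steps)
    )


p0, V0, V = 1.0, 1.0, 2.0


def adiabat(volume):
    """Pressure along the quasi-static adiabatic line p * V**GAMMA = const."""
    return p0 * (V0 / volume) ** GAMMA


def isobar(volume):
    """Pressure along an expansion at constant pressure (not adiabatic)."""
    return p0


W_adiabatic = quasi_static_work(adiabat, V0, V)
Q_adiabatic = heat_supplied((p0, V0), (adiabat(V), V), W_adiabatic)

W_isobaric = quasi_static_work(isobar, V0, V)
Q_isobaric = heat_supplied((p0, V0), (p0, V), W_isobaric)

print("adiabatic expansion: W =", W_adiabatic, " Q =", Q_adiabatic)
print("isobaric expansion:  W =", W_isobaric, " Q =", Q_isobaric)
```

Along the adiabatic line the heat vanishes up to integration error, as (5.12) demands; along the isobar the same bookkeeping yields a finite $Q$, obtained from mechanical measurements alone.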

For systems of fluids separated by adiabatic or diathermanous 
walls the energy and the work done are additive (according to 
our definition of the walls) ; hence, for instance, 

$dQ = dQ_1 + dQ_2 = dU + p_1\,dV_1 + p_2\,dV_2$, (5.13) 

where $U = U_1 + U_2$. 


This equation is of course only of interest for the case of 
thermal contact where the equation (5.6) holds ; the system has 
then only three independent variables, for which one can choose 
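$V_1$, $V_2$ and the temperature $\vartheta$, defined by (5.7) and (5.8). Then 
$U_1 = U_1(V_1, \vartheta)$, $U_2 = U_2(V_2, \vartheta)$, and (5.13) takes the form 

$dQ = \left(\frac{\partial U_1}{\partial V_1} + p_1\right) dV_1 + \left(\frac{\partial U_2}{\partial V_2} + p_2\right) dV_2 + \left(\frac{\partial U_1}{\partial \vartheta} + \frac{\partial U_2}{\partial \vartheta}\right) d\vartheta = 0$. (5.14) 

Every adiabatic quasi-static process can be represented as a 
line in the three-dimensional $V_1 V_2 \vartheta$-space which satisfies this 
equation; let us call these for brevity 'adiabatic lines'. 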

Equation (5.14) is a differential equation of a type studied by 
Pfaff. Pfaffian equations are the mathematical expression of 
elementary thermal experiences, and one would expect that the 
laws of thermodynamics are connected with their properties. 
That is indeed the case, as Carathéodory has shown. But 
classical thermodynamics proceeded in quite a different way, 
introducing the conception of idealized thermal machines which 
transform heat into work and vice versa (William Thomson, 
Lord Kelvin), or which pump heat from one reservoir into 
another (Clausius). The second law of thermodynamics is then 
derived from the assumption that not all processes of this kind 
are possible : you cannot transform heat completely into work, 
nor bring it from a state of lower temperature to one of higher 
'without compensation' (see Appendix, 6). These are new and 
strange conceptions, obviously borrowed from engineering. 
I have mentioned that the steam-engine existed before thermo- 
dynamics; it was a matter of course at that time to use the 
notions and experiences of the engineer to obtain the laws of 
heat transformation, and the establishment of the abstract 
concepts of entropy and absolute temperature by this method 
is a wonderful achievement. It would be ridiculous to feel any- 
thing but admiration for the men who invented these methods. 
But even as a student, I thought that they deviated too much 
from the ordinary methods of physics ; I discussed the problem 
with my mathematical friend, Carathéodory, with the result 
that he analysed it and produced a much more satisfactory 
solution. This was about forty years ago, but still all text- 
books reproduce the 'classical' method, and I am almost certain 
that the same holds for the great majority of lectures. I know, 
however, a few exceptions, namely those of the late R. H. Fowler 
and his school. This state of affairs seems to me one of unhealthy 
conservatism. I take in these lectures an opportunity to advo- 
cate a change. 

The central point of Carathéodory's method is this. The 
principles from which Kelvin and Clausius derived the second 
law are formulated in such a way as to cover the greatest 
possible range of processes incapable of execution: in no way 
whatever can heat be completely transformed into work or 
raised to a higher level of temperature. Carathéodory remarked 
that it is perfectly sufficient to know the existence of some 
impossible processes to derive the second law. I need hardly 
say that this is a logical advantage. Moreover, the impossible 
processes are already obtained by scrutinizing Joule's experi- 
ments a little more carefully. They consisted in bringing a 
system in an adiabatic enclosure from one equilibrium state to 
another by doing external work : it is an elementary experience, 
almost obvious, that you cannot get your work back by reversing 
the process. And that holds however near the two states are. 
One can therefore say that there exist adiabatically inaccessible 
states in any vicinity of a given state. That is Carathéodory's 
principle. 

In particular, there are neighbour states of any given 
one which are inaccessible by quasi-static adiabatic processes. 
These are represented by adiabatic lines satisfying the 
Pfaffian equation (5.14). Therefore the question arises: Does 
Carathéodory's postulate hold for any Pfaffian or does it mean 
a restriction ? 

The latter is the case, and it can be seen by very simple 
mathematics indeed, of which I shall give here a short sketch 
(see Appendix, 7). 

Let us first consider a Pfaffian equation of two variables, 
$x$ and $y$, 

$dQ = X\,dx + Y\,dy = 0$, (5.15) 

where $X$, $Y$ are functions of $x$, $y$. This is equivalent to the 
ordinary differential equation 

$\frac{dy}{dx} = -\frac{X}{Y}$, (5.16) 

which has an infinite number of solutions $\phi(x, y)$ = const., 
representing a one-parameter set of curves in the $(x, y)$-plane. 
Along any of these curves one has 

$d\phi = \frac{\partial \phi}{\partial x}\,dx + \frac{\partial \phi}{\partial y}\,dy = 0$, (5.17) 

and this must be the same condition as the given Pfaffian; 
hence one must have 

$dQ = \lambda\,d\phi$. (5.18) 

Each Pfaffian $dQ$ of two variables has therefore an 'integrating 
denominator' $\lambda$, so that $dQ/\lambda$ is a total differential. 
For Pfaffians of three (or more) variables, 

$dQ = X\,dx + Y\,dy + Z\,dz$, (5.19) 

this does not hold. It is easy to give analytical examples (see 
Appendix, 7) ; but one can see it geometrically in this way: if in 
(5.19) $dx$, $dy$, $dz$ are regarded as finite differences $\xi - x$, $\eta - y$, $\zeta - z$, 
it is the equation of a plane through the point $x, y, z$; one has a 
plane through each point of space, continuously varying in 
orientation with the position of this point. Now if a function $\phi$ 
existed, these planes would have to be tangential to the surfaces 
$\phi(x, y, z)$ = const. But one can construct continuously varying 
sets of planes which are not 'integrable', i.e. tangential to a set 
of surfaces. For example, take all circular screws with the same 
axis, but varying radius and pitch, and construct at each point 
of every screw the normal plane ; these obviously form a non- 
integrable set of planes. 

Hence all Pfaffians can be separated into two classes : those of 
the form $dQ = \lambda\,d\phi$, which have an 'integrating denominator' 
and represent the tangential planes of a set of surfaces $\phi$ = const., 
and those which lack this property. 
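The two classes can be told apart by a standard criterion that is not stated in the text and is brought in here as an aid: a Pfaffian $X\,dx + Y\,dy + Z\,dz$ in three variables has an integrating denominator exactly when the vector $\mathbf{A} = (X, Y, Z)$ satisfies $\mathbf{A}\cdot\operatorname{curl}\mathbf{A} = 0$. The sketch below evaluates this quantity by finite differences for three examples; the example fields, among them the normal planes of screws of fixed pitch about the $z$-axis, and all function names are illustrative assumptions.

```python
def curl(field, point, h=1e-5):
    """Curl of a vector field at a point, by central differences."""
    def partial(component, axis):
        lower = list(point)
        upper = list(point)
        lower[axis] -= h
        upper[axis] += h
        return (field(*upper)[component] - field(*lower)[component]) / (2.0 * h)

    return (
        partial(2, 1) - partial(1, 2),  # dZ/dy - dY/dz
        partial(0, 2) - partial(2, 0),  # dX/dz - dZ/dx
        partial(1, 0) - partial(0, 1),  # dY/dx - dX/dy
    )


def integrability_defect(field, point):
    """A . curl A; zero at every point if the Pfaffian is integrable."""
    A = field(*point)
    C = curl(field, point)
    return sum(a * c for a, c in zip(A, C))


def total_differential(x, y, z):
    """y dx + x dy + dz = d(xy + z)."""
    return (y, x, 1.0)


def needs_denominator(x, y, z):
    """y dx - x dy: not a total differential, but y**2 * d(x/y)."""
    return (y, -x, 0.0)


def screw_planes(x, y, z):
    """-y dx + x dy + dz: normal planes of helices about the z-axis."""
    return (-y, x, 1.0)


point = (0.3, -1.2, 0.5)
defects = {
    "total differential": integrability_defect(total_differential, point),
    "needs denominator": integrability_defect(needs_denominator, point),
    "screw planes": integrability_defect(screw_planes, point),
}
print(defects)
```

The first two examples give zero and belong to the class $dQ = \lambda\,d\phi$; the screw planes give a non-vanishing value and belong to the second class.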

Now in the first case, $dQ = \lambda\,d\phi$, any line satisfying the 
Pfaffian equation (5.19) must lie in the surface $\phi$ = const. 
Hence an arbitrary pair of points $P_0$ and $P$ in the $xyz$-space 
cannot be connected by such a line. This is quite elementary. 
Not quite so obvious is the inverse statement which is used in the 
thermodynamic application: If there are points $P$ in any 
vicinity of a given point $P_0$ which cannot be connected with $P_0$ 
by a line satisfying the Pfaffian equation (5.19), then there 
exists an integrating denominator and one has $dQ = \lambda\,d\phi$. 

One can intuitively understand this theorem by a continuity 
consideration : All points $P$ inaccessible from $P_0$ will fill a certain 
volume, bounded by a surface of accessible points going through $P_0$. 
Further, to each inaccessible point there corresponds another 
one in the opposite direction ; hence the boundary surface must 
contain all accessible points : which proves the existence of the 
function $\phi$, so that $dQ = \lambda\,d\phi$ (see Appendix, 7). 

The application of this theorem to thermodynamics is now 
simple. Combining it with Carathéodory's principle, one has for 
any two systems 

$dQ_1 = \lambda_1\,d\phi_1$, $dQ_2 = \lambda_2\,d\phi_2$, (5.20) 

and for the combined system 

$dQ = dQ_1 + dQ_2 = \lambda\,d\phi$; (5.21) 

hence 

$\lambda\,d\phi = \lambda_1\,d\phi_1 + \lambda_2\,d\phi_2$. (5.22) 

Consider in particular two simple fluids in thermal contact; 
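then the system has three independent variables $V_1$, $V_2$, $\vartheta$, which 
can be replaced by $\phi_1$, $\phi_2$, $\vartheta$. Then (5.22) shows that $\phi$ depends 
only on $\phi_1$, $\phi_2$, and not on $\vartheta$, while 

$\frac{\partial \phi}{\partial \phi_1} = \frac{\lambda_1}{\lambda}$, $\frac{\partial \phi}{\partial \phi_2} = \frac{\lambda_2}{\lambda}$. (5.23) 

Hence these quotients are also independent of $\vartheta$, 

$\frac{\partial}{\partial \vartheta}\frac{\lambda_1}{\lambda} = 0$, $\frac{\partial}{\partial \vartheta}\frac{\lambda_2}{\lambda} = 0$, 

from which one infers 

$\frac{1}{\lambda_1}\frac{\partial \lambda_1}{\partial \vartheta} = \frac{1}{\lambda_2}\frac{\partial \lambda_2}{\partial \vartheta} = \frac{1}{\lambda}\frac{\partial \lambda}{\partial \vartheta}$. (5.24) 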

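Now $\lambda_1$ is a variable of the first fluid only, therefore only 
dependent on $\phi_1$ and $\vartheta$; in the same way $\lambda_2 = \lambda_2(\phi_2, \vartheta)$. The 
first equality (5.24) can only hold if both quantities depend only 
on $\vartheta$. Hence 

$\frac{\partial \log \lambda_1}{\partial \vartheta} = \frac{\partial \log \lambda_2}{\partial \vartheta} = \frac{\partial \log \lambda}{\partial \vartheta} = g(\vartheta)$, (5.25) 

where $g(\vartheta)$ is a universal function, namely numerically identical 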
for different fluids and for the combined system. 

This simple consideration leads with ordinary mathematics 
to the existence of a universal function of temperature. The rest 
is just a matter of normalization. From (5.25) one finds for each 

$\log \lambda = \int g(\vartheta)\,d\vartheta + \log \Phi$, $\lambda = \Phi\,e^{\int g(\vartheta)\,d\vartheta}$, (5.26) 

where $\Phi$ depends on the corresponding $\phi$. 
If one now defines 

$T(\vartheta) = C\,e^{\int g(\vartheta)\,d\vartheta}$, $S(\phi) = \frac{1}{C}\int \Phi(\phi)\,d\phi$, (5.27) 

where the constant $C$ can be fixed by prescribing the value of 
$T_1 - T_2$ for two reproducible states of some normal substance 
(e.g. $T_1 - T_2 = 100$, if $T_1$ corresponds to the boiling-point, $T_2$ to 
the freezing-point of water at 1 atmosphere of pressure), then 
one has 

$dQ = \lambda\,d\phi = T\,dS$. (5.28) 

$T$ is the thermodynamical or absolute temperature and $S$ the 
entropy. 
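The statement that $T$ is an integrating denominator of $dQ$ can be checked on a familiar case. The sketch assumes one mole of a monatomic ideal gas, for which the gas-thermometer temperature coincides with $T$ and $dQ = C_V\,dT + (RT/V)\,dV$; it integrates $dQ$ and $dQ/T$ round a closed rectangle in the $VT$-plane. The model, the rectangle and the names are assumptions for illustration.

```python
import math

R_GAS = 8.314          # J / (mol K)
C_V = 1.5 * R_GAS      # heat capacity at constant volume, monatomic gas


def pfaffian(V, T):
    """Coefficients (X, Y) of dQ = X dV + Y dT for the assumed ideal gas."""
    return R_GAS * T / V, C_V


def loop_integral(weight, V1, V2, T1, T2, steps=2000):
    """Integrate weight * dQ round the rectangle, by the midpoint rule.

    Path: (V1,T1) -> (V2,T1) -> (V2,T2) -> (V1,T2) -> (V1,T1).
    """
    corners = [(V1, T1), (V2, T1), (V2, T2), (V1, T2), (V1, T1)]
    total = 0.0
    for (Va, Ta), (Vb, Tb) in zip(corners, corners[1:]):
        dV = (Vb - Va) / steps
        dT = (Tb - Ta) / steps
        for i in range(steps):
            V = Va + (i + 0.5) * dV
            T = Ta + (i + 0.5) * dT
            X, Y = pfaffian(V, T)
            total += weight(V, T) * (X * dV + Y * dT)
    return total


V1, V2, T1, T2 = 1.0, 2.0, 300.0, 400.0
heat_round_trip = loop_integral(lambda V, T: 1.0, V1, V2, T1, T2)
entropy_round_trip = loop_integral(lambda V, T: 1.0 / T, V1, V2, T1, T2)
expected_heat = R_GAS * (T1 - T2) * math.log(V2 / V1)

print("closed-path integral of dQ:  ", heat_round_trip)
print("closed-path integral of dQ/T:", entropy_round_trip)
```

The integral of $dQ$ round the closed path does not vanish, so $dQ$ is no total differential; that of $dQ/T$ vanishes within numerical error, as it must if $dS = dQ/T$ is one.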

Equation (5.28) refers only to quasi-static processes, that 
is, to sequences of equilibrium states. To get a result about 
real dynamical phenomena one has to apply Carathéodory's 
principle again, considering a finite transition from an initial 
state $V_1^0, V_2^0, S_0$ to a final state $V_1, V_2, S$. One can reach the latter 
one in two steps: first changing the volume quasi-statically 
(and adiabatically) from $V_1^0, V_2^0$ to $V_1, V_2$, the entropy remaining 
constant, equal to $S_0$, and then changing the state adiabatically, 
but irreversibly (by stirring, etc.) at constant volume, so that 
$S_0$ goes over into $S$. 

Now if any neighbouring value $S$ of $S_0$ could be reached in 
this way, one would have a contradiction to Carathéodory's 
principle, as the volumes are of course arbitrarily changeable. 
Hence for each such process one must have either $S \geq S_0$ or 
$S \leq S_0$. Continuity demands that the same sign holds for all 
initial states; it holds also for different substances since the 
entropy is additive (as can be easily seen). The actual sign $\geq$ 
or $\leq$ depends on the choice of the constant $C$ in (5.27); if this is 
chosen so that T is positive, a single experience, say with a gas, 
shows that entropy never decreases. 

It may not be superfluous to add a remark on the behaviour 
of entropy for the case of conduction of heat. As thermo- 
dynamics has to do only with processes where the initial and 
final states are equilibria, stationary flow cannot be treated : one 
can only ask, What is the final state of two initially separated 
bodies brought into thermal contact ? The difficulty is that a 
change of entropy is only defined by quasi-static adiabatic 
processes ; the sudden change of thermal isolation into contact, 
however, is discontinuous and the processes inside the system 
not controllable. Yet one can reduce this process to the one 
considered before. By quasi-static adiabatic changes of volume 
the temperatures can be made equal without change of entropy ; 
then contact can be made without discontinuity, and the initial 
volumes quasi-statically restored, again without a change of 
entropy. The situation is now the same as in the initial state 
considered before, and it follows that any process leading to the 
final state must increase the entropy. 

The whole chain of considerations can be generalized for more 
complicated systems without any difficulty. One has only to 
assume that all independent variables except one are of the type 
represented by the volume, namely arbitrarily changeable. 

If one has to deal, as in chemistry, with substances which are 
mixtures of different components, one can regard the concentra- 
tions of these as arbitrarily variable with the help of semi- 
permeable walls and movable pistons (see Appendix, 8). 

By using thermodynamics a vast amount of knowledge has 
been accumulated not only in physics but in the borderland 
sciences of physico-chemistry, metallurgy, mineralogy, etc. Most 
of it refers to equilibria. In fact, the expression 'thermodynamics' 
is misleading. The only dynamical statements possible are 
concerned with the irreversible transitions from one equilibrium 
state to another, and they are of a very modest character, giving 
the total increase of entropy or the decrease of free energy 
$F = U - TS$. The irreversible process itself is outside the scope 
of thermodynamics. 

The principle of antecedence is now satisfied ; but this gain is 
paid for by the loss of all details of description which ordinary 
dynamics of continuous media supplies. 

Can this not be mended? Why not apply the methods of 
Cauchy to thermal processes, by treating each volume element 
as a small thermodynamical system, and regarding not only 
strain, stress, and energy, but also temperature and entropy as 
continuous functions in space ? This has of course been done, 
but with limited success. The reason is that thermodynamics is 
definitely connected with walls or enclosures. We have used the 
adiabatic and diathermanous variety, and mentioned semi- 
permeable walls necessary for chemical separations; but a 
volume element is not surrounded by a wall, it is in free contact 
with its neighbourhood. The thermodynamic change to which 
it is subject depends therefore on the flux of energy and material 
constituents through its boundary, which themselves cannot be 
reduced to mechanics. In some limiting cases, one has found 
simple solutions. For instance, when calculating the velocity of 
sound in a gas, one tried first for the relation between pressure 
$p$ and density $\rho$ the isothermal law $p = c\rho$ where $c$ is a constant, 
but found no agreement with experiment ; then one took the 
adiabatic law $p = c\rho^{\gamma}$ where $\gamma$ is the ratio of the specific heats at 
constant pressure and constant volume (see Appendix, 9), which 
gave a much better result. The reason is that for fast vibrations 
there is no time for heat to flow through the boundary of a 
volume element which therefore behaves as if it were adiabati- 
cally enclosed. But by making the vibrations slower and slower, 
one certainly gets into a region where this assumption does not 
hold any more. Then conduction of heat must be taken into 
account. The hydrodynamical equations and those of heat con- 
duction have to be regarded as a simultaneous system. In this 
way a descriptive or phenomenological theory can be developed 
and has been developed. Yet I am unable to give an account of 
it, as I have never studied it; nor have the majority of physicists 
shown much interest in this kind of thing. One knows that any 
flux of matter and energy can be fitted into Cauchy's general 
scheme, and there is not much interest in doing it in the most 
general way. Besides, each effect needs separate constants 
(e.g. in liquids compressibility, specific heat, conductivity of heat, 
constants of diffusion ; in solids elastic constants and parameters 
describing plastic flow, etc.), and very often these so-called con- 
stants turn out to be not constants, but to depend on other 
quantities (see Appendix, 10). 
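The example of the velocity of sound mentioned above can be put into figures. For a law $p = c\rho^{n}$ the velocity is $\sqrt{dp/d\rho} = \sqrt{n\,p/\rho}$, with $n = 1$ for the isothermal and $n = \gamma$ for the adiabatic law. The sketch uses approximate values for air at 0 °C and 1 atmosphere; these values and the function name are assumptions for illustration.

```python
import math

# Approximate values for air at 0 degrees C and 1 atmosphere (assumptions).
PRESSURE = 101325.0   # Pa
DENSITY = 1.293       # kg / m^3
GAMMA_AIR = 1.4       # ratio of specific heats


def sound_speed(pressure, density, exponent):
    """Velocity of sound for the law p = c * rho**exponent.

    sqrt(dp/drho) = sqrt(exponent * p / rho).
    """
    return math.sqrt(exponent * pressure / density)


isothermal_speed = sound_speed(PRESSURE, DENSITY, 1.0)
adiabatic_speed = sound_speed(PRESSURE, DENSITY, GAMMA_AIR)

print("isothermal law:", isothermal_speed, "m/s")
print("adiabatic law: ", adiabatic_speed, "m/s")
```

The isothermal law gives roughly 280 m/s and the adiabatic law roughly 331 m/s; the measured velocity in air at 0 °C is about 331 m/s, which is the agreement referred to in the text.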

Therefore one can rightly say that with ordinary thermo- 
dynamics the descriptive method of physics has come to its 
natural end. Something new had to appear. 



THE new turn in physics was the introduction of atomistics and 

To follow up the history of atomistics into the remote past is 
not in the plan of this lecture. We can take it for granted that 
since the days of Demokritos the hypothesis of matter being 
composed of ultimate and indivisible particles was familiar to 
every educated man. It was revived when the time was ripe. 
Lord Kelvin quotes frequently a Father Boscovich as one of the 
first to use atomistic considerations to solve physical problems ; 
he lived in the eighteenth century, and there may have been 
others, of whom I know nothing, thinking on the same lines. 
The first systematic use of atomistics was made in chemistry, 
where it allowed the reduction of innumerable substances to a 
relatively small stock of elements. Physics followed considerably 
later because atomistics as such was of no great use without 
another fundamental idea, namely that the observable properties 
of matter are not intrinsic qualities of its smallest parts, but 
averages over distributions governed by the laws of chance. 

The theory of probability itself, which expresses these laws, is 
much older; it sprang not from the needs of natural science but from 
gambling and other, more or less disreputable, human activities. 

The first use of probability considerations in science was made 
by Gauss in his theory of experimental errors. I can suppose 
that every scientist knows the outlines of it, yet I have to dwell 
upon it for a few moments because of its fundamental and 
somewhat paradoxical aspect. It has a direct bearing on the 
method of inference by induction which is the backbone of all 
human experience. I have said that in my opinion the signifi- 
cance of this method in science consists in the establishment of a 
code of rules which form the constitution of science itself. Now 
the curious situation arises that this code of rules, which ensures 
the possibility of scientific laws, in particular of the cause-effect 


relation, contains besides many other prescriptions those related 
to observational errors, a branch of the theory of probability. 
This shows that the conception of chance enters into the very 
first steps of scientific activity, in virtue of the fact that no 
observation is absolutely correct. I think chance is a more 
fundamental conception than causality ; for whether in a con- 
crete case a cause-effect relation holds or not can only be judged 
by applying the laws of chance to the observations. 

The history of science reveals a strong tendency to forget this. 
When a scientific theory is firmly established and confirmed, it 
changes its character and becomes a part of the metaphysical 
background of the age: a doctrine is transformed into a dogma. 
In fact no scientific doctrine has more than a probability value 
and is open to modification in the light of new experience. 

After this general remark, let us return to the question how 
the notion of chance and probability entered physics itself. 

As early as 1738 Daniel Bernoulli suggested the interpretation 
of gas pressure as the effect of the impact of numerous particles 
on the wall of the container. The actual development of the 
kinetic theory of gases was, however, accomplished much later, 
in the nineteenth century. 

The object of the theory was to explain the mechanical and 
thermodynamical properties of gas from the average behaviour 
of the molecules. For this purpose a statistical hypothesis was 
made, often called the 'principle of molecular chaos': for an 
'ideal' gas in a closed vessel and in absence of external forces all
positions and all directions of velocity of the molecules are 
equally probable. 

Applied to a monatomic gas (the atoms are supposed to be
mass points), this leads at once to a relation between volume $V$,
pressure $p$, and mean energy $U$ (see Appendix, 11)

$$Vp = \tfrac{2}{3}U, \tag{6.1}$$

if the pressure $p$ is interpreted as the total momentum transferred
per unit time to unit area of the wall by the impact of the molecules.
One has now only to assume that the energy $U$ is a measure of
temperature to obtain Boyle's law of the isotherms. Then it follows
from thermodynamics that $U$ is proportional to the absolute
temperature (see Appendix, 9); one has

$$U = \tfrac{3}{2}RT, \qquad pV = RT, \tag{6.2}$$

where $R$ is the ordinary gas constant. This is the complete
equation of state (combined Boyle-Charles law), and one sees
that the specific heat of a monatomic gas for constant volume
is $\tfrac{3}{2}R$.
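A minimal numerical sketch of relation (6.1) may help here. It assumes only that all directions of velocity are equally probable; the speed law (uniform between 0.5 and 1.5), the function name `check_pressure_relation`, and the units (mass and volume set to 1) are arbitrary choices made for illustration and are not taken from the text.

```python
import math
import random

def check_pressure_relation(n_molecules=100_000, mass=1.0, volume=1.0, seed=0):
    """Return (p*V, 2/3*U) for molecules with random, isotropic velocities."""
    rng = random.Random(seed)
    sum_vx2 = 0.0  # sum of v_x^2, the component normal to the chosen wall
    sum_v2 = 0.0   # sum of v^2
    for _ in range(n_molecules):
        speed = rng.uniform(0.5, 1.5)        # arbitrary speed law (assumption)
        cos_theta = rng.uniform(-1.0, 1.0)   # uniform direction on the sphere
        phi = rng.uniform(0.0, 2.0 * math.pi)
        vx = speed * math.sqrt(1.0 - cos_theta ** 2) * math.cos(phi)
        sum_vx2 += vx * vx
        sum_v2 += speed * speed
    # Molecules with v_x > 0 deliver 2*m*v_x per impact at a rate v_x/V per unit
    # wall area; summing over that half gives p = (m/V) * (sum over all of v_x^2).
    pressure = mass * sum_vx2 / volume
    energy = 0.5 * mass * sum_v2
    return pressure * volume, 2.0 * energy / 3.0

pv, two_thirds_u = check_pressure_relation()
```

The two returned numbers agree up to statistical scatter, whatever speed law is chosen; only the isotropy of the directions is used.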

I have mentioned these things only to stress the point that the 
kinetic theory right from the beginning produced verifiable 
numerical results in abundance. There could be no doubt that 
it was right, but what did it really mean ? 

How is it possible that probability considerations can be 
superimposed upon the deterministic laws of mechanics without 
a clash ? 

These laws connect the state at a time $t$ to the initial state,
at time $t_0$, by definite equations. They involve, however, no
restriction on the initial state. This has to be determined by 
observation in every concrete case. But observations are not 
absolutely accurate; the results of measurements will suffer 
scattering according to Gauss's rules of experimental errors. In 
the case of gas molecules, the situation is extreme ; for owing 
to the smallness and excessive number of the molecules, there is 
almost perfect ignorance of the initial state. 

The only facts known are the geometrical restriction of the 
position of each molecule by the walls of the vessel, and some 
physical quantities of a crude nature, like the resultant pressure 
and the total energy : very little indeed in view of the number of 
molecules (about $10^{19}$ per c.c.).

Hence it is legitimate to apply probability considerations to 
the initial state, for instance the hypothesis of molecular chaos. 
The statistical behaviour of any future state is then completely 
determined by the laws of mechanics. This is in particular the 
case for 'statistical equilibrium', when the observable properties
are independent of time ; in this case any later state must have 
the same statistical properties as the initial state (e.g. it must 
also satisfy the condition of molecular chaos). How can this be 


mathematically formulated? It is convenient to use the equations
of motion in Hamilton's canonical form (4.3, p. 18). The
distribution is described by a function $f(t, q_1, q_2, \ldots, q_n, p_1, p_2, \ldots, p_n)$
of all coordinates and momenta, and of time, such that $f\,dp\,dq$
is the probability for finding the system at time $t$ in a given
element $dp\,dq = dp_1 \ldots dp_n\,dq_1 \ldots dq_n$. One can interpret this
function as the density of a fluid in a $2n$-dimensional $pq$-space,
called 'phase space'; and, as no particles are supposed to disappear
or to be generated, this fluid must satisfy a continuity
equation, of the kind (4.5, p. 20), generalized for $2n$ dimensions,
namely (see Appendix, 3)

$$\frac{\partial f}{\partial t} + \sum_{k=1}^{n} \left\{ \frac{\partial}{\partial q_k}(f\dot{q}_k) + \frac{\partial}{\partial p_k}(f\dot{p}_k) \right\} = 0. \tag{6.3}$$

This reduces in virtue of the canonical equations (4.3), p. 18, to

$$\frac{\partial f}{\partial t} - [H, f] = 0, \tag{6.4}$$

where $[H, f]$ is an abbreviation, the so-called Poisson bracket, namely

$$[H, f] = \sum_{k=1}^{n} \left( \frac{\partial H}{\partial q_k}\frac{\partial f}{\partial p_k} - \frac{\partial H}{\partial p_k}\frac{\partial f}{\partial q_k} \right). \tag{6.5}$$

On the other hand, the convective derivative defined for three
dimensions in (4.11, p. 21) may be generalized for $2n$ dimensions thus:

$$\frac{df}{dt} = \frac{\partial f}{\partial t} + \sum_{k=1}^{n} \left( \frac{\partial f}{\partial q_k}\dot{q}_k + \frac{\partial f}{\partial p_k}\dot{p}_k \right). \tag{6.6}$$

Then (6.4) says that in virtue of the mechanical equations

$$\frac{df}{dt} = 0. \tag{6.7}$$

The result expressed by the equivalent equations (6.4) and
(6.7) is called Liouville's theorem. The density function is an
integral of the canonical equations, i.e. $f = \text{const.}$ along any
trajectory in phase space; in other words, the substance of the
fluid is carried along by the motion in phase space, so that the integral

$$I = \int f\,dp\,dq \tag{6.8}$$



over any part of the substance moving in phase space is inde- 
pendent of time. 
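The conservation of phase-space volume can be illustrated on the simplest possible case. The sketch below assumes a one-dimensional harmonic oscillator (a toy system chosen here, not discussed in the text), for which the canonical equations can be solved exactly; the names `flow` and `polygon_area` and the numerical values are hypothetical. Because the flow is linear, a triangle of initial states remains a triangle, and its area can be compared at different times.

```python
import math

def flow(q, p, t, mass=1.0, omega=2.0):
    """Exact solution of the canonical equations for H = p^2/2m + m*omega^2*q^2/2."""
    c, s = math.cos(omega * t), math.sin(omega * t)
    return (q * c + p * s / (mass * omega), p * c - mass * omega * q * s)

def polygon_area(points):
    """Shoelace formula for the area enclosed by a polygon in the (q, p) plane."""
    area = 0.0
    for (q1, p1), (q2, p2) in zip(points, points[1:] + points[:1]):
        area += q1 * p2 - q2 * p1
    return abs(area) / 2.0

# A small triangle of initial states, carried along by the motion.
triangle = [(1.0, 0.0), (1.5, 0.2), (1.2, 0.7)]
areas = [polygon_area([flow(q, p, t) for q, p in triangle])
         for t in (0.0, 0.3, 1.0, 5.0)]
```

All entries of `areas` coincide up to rounding, which is the statement that the integral over any part of the substance moving in phase space is independent of time, here for a uniform density.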

Any admissible distribution function, namely one for which 
the probabilities of a configuration at different times are com- 
patible with the deterministic laws of mechanics, must be an 
integral of motion, satisfying the partial differential equation 
(6.4). For a closed system, i.e. one which is free from external 
disturbances (like a gas in a solid vessel), H is explicitly inde- 
pendent of time. The special case of statistical equilibrium 
corresponds to certain time-independent solutions of (6.4), i.e.
functions $f$ satisfying

$$[H, f] = 0. \tag{6.9}$$

An obvious integral of this equation is $f = \Phi(H)$, where $\Phi$
indicates an arbitrary function. This case plays a prominent part
in statistical mechanics. 

Yet before continuing with these very general considerations 
we had better return to the ideal gases and consider the kinetic 
theory in more detail. In an ideal gas, the particles (atoms, 
molecules) are supposed to move independently of one another. 
Hence the function $f(p, q)$ is a product of $N$ functions $f(\mathbf{x}, \boldsymbol{\xi})$,
each belonging to a single particle and all formally identical;
$\mathbf{x}$ is the position vector and $\boldsymbol{\xi} = (1/m)\mathbf{p}$ the velocity vector. Then
$f\,d\mathbf{x}\,d\boldsymbol{\xi}$ is the probability of finding a particle at time $t$ at a specified
element of volume and velocity. 

In the case where no external forces are present ($\partial H/\partial t = 0$,
$\partial H/\partial \mathbf{x} = 0$) the Hamiltonian reduces to the kinetic energy,

$$H = \tfrac{1}{2}m\xi^2 = (1/2m)p^2.$$

The hypothesis of molecular chaos is expressed by assuming $f$ to
be a function of $\xi^2$ alone. This is indeed a solution of (6.9), as it
can be written in the form $f = \Phi(H)$ mentioned above. No
other solution exists if the gas as a whole is homogeneous and 
isotropic (i.e. all positions and directions are physically equiva- 
lent; see Appendix, 12). 

The determination of the velocity distribution function $f(\xi^2)$
was recognized by Maxwell as a fundamental problem of kinetic 
theory: it is the quantitative formulation of the 'law of chance' 
for this case. He gave several solutions; his first and simplest 


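reasoning was this: Suppose the three components of velocity
$\xi_1, \xi_2, \xi_3$ to be statistically independent, then

$$f(\xi^2) = f(\xi_1^2 + \xi_2^2 + \xi_3^2) = \phi(\xi_1)\,\phi(\xi_2)\,\phi(\xi_3). \tag{6.10}$$

This functional equation has the only solution (see Appendix, 13)

$$f = e^{\alpha - \beta\xi^2}, \qquad \phi(\xi_i) = e^{\alpha/3 - \beta\xi_i^2}, \tag{6.11}$$

where $\alpha, \beta$ are constants.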

This is Maxwell's celebrated law of velocity distribution. 
However, the derivation given is objectionable, as the supposed 
independence of the velocity components is not obvious at all. 
I have mentioned it because the latest proof (and as I think the
most satisfactory and rigorous and of the widest possible
generalization) of the distribution formula uses exactly this Maxwellian
argument, only applied to more suitable variables as we shall 
presently see. 
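For orientation, the step from Maxwell's independence assumption to the exponential law can be sketched as follows. The sketch starts from the requirement that $f(\xi_1^2 + \xi_2^2 + \xi_3^2)$ be a product $\phi(\xi_1)\phi(\xi_2)\phi(\xi_3)$, and it assumes in addition that $f$ is continuous and that $f(0) \neq 0$; the auxiliary function $g$ is introduced here only for the purpose of the argument.

```latex
\begin{align*}
\xi_2 = \xi_3 = 0:\quad & f(\xi_1^2) = \phi(\xi_1)\,\phi(0)^2, \qquad f(0) = \phi(0)^3,\\
g(x) := f(x)/f(0):\quad & g(x+y+z) = g(x)\,g(y)\,g(z), \qquad g(0) = 1,\\
z = 0:\quad & g(x+y) = g(x)\,g(y) \;\Longrightarrow\; g(x) = e^{-\beta x},\\
& f(\xi^2) = f(0)\,e^{-\beta\xi^2} = e^{\alpha - \beta\xi^2}, \qquad \alpha = \log f(0).
\end{align*}
```

The third line is Cauchy's functional equation for the exponential function; the sign of $\beta$ is fixed only afterwards, by the demand that the distribution can be normalized.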

Maxwell, being aware of this weakness, gave several other 
proofs which have been improved and modified by other authors. 
Eventually it appears that there are two main types of argu- 
ment: the equilibrium proof and the dynamical proof. We shall 
first consider the equilibrium proof in some detail. 

Assume each molecule to be a mechanical system with coordinates
$q_1, q_2, \ldots$ and momenta $p_1, p_2, \ldots$, for which we write
simply $q, p$, and with a Hamiltonian $H(p, q)$. The interaction
between the molecules will be neglected. The total number $n$
and the total energy $U$ of the assembly of molecules are given.
In order to apply the laws of probability it is convenient to 
reduce the continuous set of points p, q in phase space to a dis- 
continuous enumerable set of volume elements. One divides 
the phase space into $N$ small cells of volumes $\omega_1 V, \omega_2 V, \ldots, \omega_N V$,
where $V$ is the total volume; hence

$$\omega_1 + \omega_2 + \ldots + \omega_N = 1. \tag{6.12}$$

To each cell a value of the energy $H(p, q)$ can be attached, say
that corresponding to its centre; let these energies be $\epsilon_1, \epsilon_2, \ldots, \epsilon_N$.
Now suppose the particles distributed over the cells so that there
are $n_1$ in the first cell, $n_2$ in the second, etc., but of course with
the restriction that the totals

$$n_1 + n_2 + \ldots + n_N = n, \tag{6.13}$$

$$n_1\epsilon_1 + n_2\epsilon_2 + \ldots + n_N\epsilon_N = U \tag{6.14}$$

are fixed. Liouville's theorem suggests that the probability of a 
single molecule being in a given cell is proportional to its volume. 
Making this assumption, one has to calculate the composite 
probability $P$ for any distribution $n_1, n_2, \ldots, n_N$ under the
restrictions (6.13) and (6.14).

This is an elementary problem of the calculus of probability 
(see Appendix, 14) which can be solved in this way: First the 
second condition (6.14) is omitted; then the probability of a 
given distribution $n_1, n_2, \ldots, n_N$ is

$$P(n_1, n_2, \ldots, n_N) = \frac{n!}{n_1!\,n_2! \ldots n_N!}\,\omega_1^{n_1}\omega_2^{n_2} \ldots \omega_N^{n_N}. \tag{6.15}$$

If this is summed over all $n_1, n_2, \ldots, n_N$ satisfying (6.13), one
obtains by the elementary polynomial theorem

$$\sum P(n_1, n_2, \ldots, n_N) = (\omega_1 + \omega_2 + \ldots + \omega_N)^n = 1, \tag{6.16}$$

because of (6.12), as it must be if $P$ is a properly normalized probability.
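The normalization can be checked by brute force for a small assembly. The sketch below enumerates every distribution of $n = 6$ particles over three cells and sums the multinomial probabilities; the relative cell volumes in `omega` and the helper names `compositions` and `probability` are hypothetical choices for illustration.

```python
from math import factorial, prod

def compositions(n, cells):
    """Yield every tuple (n_1, ..., n_cells) of non-negative integers summing to n."""
    if cells == 1:
        yield (n,)
        return
    for first in range(n + 1):
        for rest in compositions(n - first, cells - 1):
            yield (first,) + rest

def probability(occupation, omega):
    """Multinomial probability of one distribution of particles over the cells."""
    coefficient = factorial(sum(occupation))
    for n_k in occupation:
        coefficient //= factorial(n_k)   # each division is exact
    return coefficient * prod(w ** n_k for w, n_k in zip(omega, occupation))

omega = [0.5, 0.3, 0.2]   # hypothetical relative cell volumes, summing to 1
total = sum(probability(occ, omega) for occ in compositions(6, len(omega)))
```

The value of `total` equals 1 up to rounding, as the polynomial theorem demands.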

It is well known that the polynomial coefficients $n!/(n_1!\,n_2! \ldots n_N!)$
have a sharp maximum for $n_1 = n_2 = \ldots = n_N$; that means, if
all cells have equal volumes ($\omega_1 = \omega_2 = \ldots = \omega_N$) the uniform
distribution would have an overwhelming probability. Yet this 
is modified by the second condition (6.14) which we have now 
to take into account. The simplest method of doing this proceeds 
in three approximations which seem to be crude, but are perfectly 
satisfactory for very large numbers of particles ($n \to \infty$). The
first approximation consists in neglecting all distributions of
comparatively small $n_1, n_2, \ldots, n_N$; then the $n_k$ can be treated as
continuous variables. The second approximation consists in
replacing the exact expression (6.15) by its asymptotic value for
large $n_k$ by using Stirling's formula $\log(n!) \to n(\log n - 1)$ (see
Appendix, 14), and the result is

$$\log P = -n_1\log n_1 - n_2\log n_2 - \ldots - n_N\log n_N + \text{const.} \tag{6.17}$$


The third approximation consists in the following assumption: 
the actual behaviour of a gas in statistical equilibrium is deter- 
mined solely by the state of maximum probability; all other 
states have so little chance to appear that they can be neglected. 

Hence one has to determine the maximum of log P given by 
(6.17) under the conditions (6.13) and (6.14). Using elementary 
calculus this leads at once to 

$$n_k = e^{\alpha - \beta\epsilon_k}, \tag{6.18}$$

where $\alpha$ and $\beta$ are two constants which are necessary in order to
satisfy the conditions (6.13), (6.14). Yet these constants play a 
rather different part. 
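How the two constants are fixed by the two conditions can be sketched numerically. The example assumes an exponential law $n_k = e^{\alpha - \beta\epsilon_k}$ for cells of equal volume and determines $\beta$ from the mean energy per particle by bisection, then $\alpha$ from the total number; the cell energies, the totals, and the function name `boltzmann_occupations` are hypothetical.

```python
import math

def boltzmann_occupations(energies, n, total_energy):
    """Find alpha, beta so that n_k = exp(alpha - beta*eps_k) has the given totals."""
    target = total_energy / n   # mean energy per particle

    def mean_energy(beta):
        weights = [math.exp(-beta * e) for e in energies]
        return sum(e * w for e, w in zip(energies, weights)) / sum(weights)

    lo, hi = -50.0, 50.0        # the mean energy decreases steadily with beta
    for _ in range(200):
        mid = 0.5 * (lo + hi)
        if mean_energy(mid) > target:
            lo = mid            # mean too high: beta must be larger
        else:
            hi = mid
    beta = 0.5 * (lo + hi)
    alpha = math.log(n / sum(math.exp(-beta * e) for e in energies))
    return alpha, beta, [math.exp(alpha - beta * e) for e in energies]

energies = [0.0, 1.0, 2.0, 3.0]   # hypothetical cell energies
alpha, beta, n_k = boltzmann_occupations(energies, n=1000, total_energy=800.0)
```

In this sketch $\alpha$ merely adjusts the total number, while $\beta$ alone decides how the particles are shared among cells of different energy.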

If one has to do with a mixture of two gases A and B with given 
numbers $n^{(A)}$ and $n^{(B)}$, one gets two conditions of the type (6.13)
but only one of the type (6.14), expressing that the total energy
is given.

Hence one obtains

$$n_k^{(A)} = e^{\alpha^{(A)} - \beta\epsilon_k^{(A)}}, \qquad n_k^{(B)} = e^{\alpha^{(B)} - \beta\epsilon_k^{(B)}}, \tag{6.19}$$

with two different constants $\alpha^{(A)}$ and $\alpha^{(B)}$, but only one $\beta$. Therefore
$\beta$ is the parameter of thermal equilibrium between the two
constituents and must depend only on temperature. 

Indeed, if one now calculates the mean energy U and the 
mean pressure p, one can apply thermodynamics and sees 
easily that the second law is satisfied if 

$$\beta = \frac{1}{kT}, \tag{6.20}$$

where T is the absolute temperature and k a constant, called 
Boltzmann's constant. At the same time it appears that the 
entropy is given by 

$$S = k\log P = -k\sum_a n_a\log n_a. \tag{6.21}$$


All these results are mainly due to Boltzmann ; in particular (6.18) 


is called Boltzmann's distribution law. It obviously contains 
Maxwell's law (6.11) as a special case, namely for mass points. 

We have now to ask: Is this consideration which I called the 
equilibrium proof of the distribution law really satisfactory ? 

One objection can be easily dismissed, namely that the 
approximations made are too crude. They can be completely 
avoided. Darwin and Fowler have shown that one can give a 
rigorous expression for the mean value of any physical quantity 
in terms of complex integrals, containing the so-called 'partition 
function' (see Appendix, 15)

$$\omega_1 z^{\epsilon_1} + \omega_2 z^{\epsilon_2} + \ldots + \omega_N z^{\epsilon_N} = F(z). \tag{6.22}$$

No distribution is neglected and no use is made of the Stirling
formula. Yet in the limit $n \to \infty$, all results are exactly the same
as given by the Boltzmann distribution function. Although this 
method is extremely elegant and powerful, it does not introduce 
any essential new feature in regard to the fundamental question 
of statistical mechanics. 

Another objection is going deeper: can the molecules of a gas 
really be treated as independent ? 

There are numerous phenomena which show they are not, 
even if one considers only statistical equilibrium. For no real 
gas is 'ideal', i.e. satisfies Boyle's law rigorously, and the devia- 
tions increase with pressure, ending in a complete collapse, con- 
densation. This proves the existence of long-range attractive 
forces between the molecules. The statistical method described 
above is unable to deal with them. The first attempt to correct 
this was the celebrated theory of van der Waals, which was 
followed by many others. I shall later describe in a few words 
the modern version of these theories, which is, from a certain 
standpoint, rigorous and satisfactory. 

More serious are the interactions revealed by non-equilibrium 
phenomena: viscosity, conduction of heat, diffusion. They can 
all be qualitatively understood by supposing that each molecule 
has a finite volume, or more correctly that two molecules have a 
short-range repulsive interaction which prohibits a close 
approach. This assumption has the consequence that there 


exists an effective cross-section for a collision, hence a mean free 
path for the straight motion of a molecule. The coefficients of 
the three phenomena mentioned can be reduced, by elementary 
considerations, to the free path, and the results, as far as they go, 
are in good agreement with observations. 

All this is very good physics producing in a simple and intuitive 
way formulae which give the correct order of magnitude of 
different correlated effects. 

But for the problem of a rigorous kinetic theory, which takes 
account of the interactions and is valid not only for equilibria, 
but also for motion, these considerations have only the value of 
a preliminary reconnoitring. The question is: How can one 
derive the hydrodynamical equations of visible motion together 
with the phenomena of transformation and conduction of heat 
and, for a mixture, of diffusion ? 

This is an ambitious programme. For such a theory must 
include the result that a gas left to itself tends to equilibrium. 
Hence it must lead to irreversibility, although the laws of 
ordinary reversible mechanics are assumed to hold for the mole- 
cules. How is this possible ? Further, is the equilibrium obtained 
in this way the same as that derived directly, say by the method 
of the most probable distribution ? 

To begin with the last question. Its answer represents what I 
have called above the dynamical proof of the distribution law 
for equilibrium. 

The formulation of the non-equilibrium theory of gases is due 
to Boltzmann. One can obtain his fundamental equation by 
generalizing one of the equivalent formulae (6.4) or (6.7). These 
are based on the assumption that each molecule moves indepen- 
dently of the others according to the laws of mechanics, and they 
describe how the distribution / of an assembly of such particles 
develops in time. Now the assumption of independence is 
dropped, hence the expression on the left-hand side of (6.4) or 
(6.7) is not zero; denoting by $f(1)$ the probability density for a
certain particle 1, one can write

$$\frac{df(1)}{dt} = \frac{\partial f(1)}{\partial t} - [H, f(1)] = C(1), \tag{6.23}$$

where $C(1)$ represents the influence of the other molecules on the
particle 1; it is called the 'collision integral', as Boltzmann cal- 
culates it only for the case where the orbit of the centre of a 
particle can be described as straight and uniform motion 
interrupted by sudden collisions. For this purpose a new and 
independent application of the laws of probability is made by 
assuming that the probability of a collision between two particles 1
and 2 is proportional to the product of the probabilities of finding
them in a given configuration, $f(1)f(2)$. If one then expresses that
some molecules are thrown by a collision out of a given element
of phase space, others into it, one obtains (see Appendix, 16)

$$C(1) = \iint \{f'(1)f'(2) - f(1)f(2)\}\,|\boldsymbol{\xi}_1 - \boldsymbol{\xi}_2|\,d\mathbf{b}\,d\boldsymbol{\xi}_2, \tag{6.24}$$

where $f(2)$ is the same function as $f(1)$, but taken for the particle
2 as argument. $f(1)$, $f(2)$ refer to the motion of two particles
'before' the collision, $f'(1)$, $f'(2)$ to that 'after' the collision; one
has to integrate over all velocities of the particle 2, ($d\boldsymbol{\xi}_2$), and over
the 'cross-section' of the collision, ($d\mathbf{b}$), which I shall not define
in detail. 'Before' and 'after' the collision mean the asymptotic
straight and uniform motions of approach and separation; it is
clear that if the former is given, the latter is completely determined
for any law of interaction force: it is the two-body problem
of mechanics. Hence the velocities of both particles $\boldsymbol{\xi}_1', \boldsymbol{\xi}_2'$ after
the collision are known functions of those before the collision
$\boldsymbol{\xi}_1, \boldsymbol{\xi}_2$, and (6.23) assumes the form of an integro-differential
equation for calculating $f$.

This equation has been the object of thorough mathematical 
investigations, first by Boltzmann and Maxwell, and later by 
modern writers. Hilbert has indicated a systematic method of 
solution in which each step of approximation leads to an integral 
equation of the normal (so-called Fredholm) type. Enskog and
Chapman have developed this method, with some modifications, 
in detail. There is an admirable book by Chapman and Cowling 
which represents the whole theory of non-homogeneous gases as
a consequence of the equation (6.23). I can only mention a few 
points of these important investigations. 

The first is concerned with the question of equilibrium. Does 


the equation (6.23) really indicate an irreversible approach from 
any initial state to a homogeneous equilibrium ? This is in fact 
the case, and a very strange result indeed: the metamorphosis 
of reversible mechanics into irreversible thermodynamics with 
the help of probability. But before discussing this difficult 
question, I shall indicate the mathematical proof. 

From the statistics of equilibrium it is known how the entropy 
is connected with probability, namely by equation (6.21). 
Replace here the discontinuous $n_k$ by the continuous $f$ and
summation by integration over the phase space, and you obtain

$$S = -k \int f(1)\log f(1)\,dq\,dp. \tag{6.25}$$

If one now calculates the time derivative $dS/dt$ by substituting
$\partial f(1)/\partial t$ from (6.23), and assuming no external interference, one
finds (see Appendix, 17)

$$\frac{dS}{dt} \ge 0, \tag{6.26}$$

where the = sign holds only if $f(1)$ is independent of the space
coordinates and satisfies, as a function of the velocities,

$$f(1)f(2) = f'(1)f'(2) \tag{6.27}$$

identically for any collision. 

The result expressed by (6.26) is often quoted as Boltzmann's
H-theorem (because he used the symbol $H$ for $-S/k$). Boltzmann
claimed that it gave the statistical explanation of
thermodynamical irreversibility.

Equation (6.27) is a functional equation which determines $f$
as a function of 'collision invariants', like total energy and total
momentum. If the gas is at rest as a whole, the only solution of
(6.27) is Maxwell's (or Boltzmann's) distribution law:

$$f = e^{\alpha - \beta\epsilon}, \qquad H(p, q) = \epsilon. \tag{6.28}$$

This is what I called the dynamical proof, and is a most remarkable 
result indeed; for it has been derived from the mechanism of 
collisions, which was completely neglected in the previous 
equilibrium methods. This point needs elucidation. 
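That an exponential law in the energy satisfies the condition $f(1)f(2) = f'(1)f'(2)$ for every collision can be verified directly, since an elastic collision conserves the total kinetic energy. The sketch below assumes mass points of equal mass and a collision that merely exchanges the velocity components along the line of centres; the names `maxwell`, `collide`, and `random_unit_vector` and the value of `beta` are hypothetical.

```python
import math
import random

def maxwell(v, alpha=0.0, beta=0.7):
    """Exponential law f = exp(alpha - beta * v^2) for a velocity vector v."""
    return math.exp(alpha - beta * sum(c * c for c in v))

def collide(v1, v2, n):
    """Elastic collision of two equal masses; n is a unit vector along the line of centres."""
    rel = sum((a - b) * c for a, b, c in zip(v1, v2, n))   # relative velocity along n
    v1_after = tuple(a - rel * c for a, c in zip(v1, n))
    v2_after = tuple(b + rel * c for b, c in zip(v2, n))
    return v1_after, v2_after

def random_unit_vector(rng):
    x = [rng.gauss(0.0, 1.0) for _ in range(3)]
    norm = math.sqrt(sum(c * c for c in x))
    return tuple(c / norm for c in x)

rng = random.Random(1)
max_relative_gap = 0.0
for _ in range(1000):
    v1 = tuple(rng.gauss(0.0, 1.0) for _ in range(3))
    v2 = tuple(rng.gauss(0.0, 1.0) for _ in range(3))
    w1, w2 = collide(v1, v2, random_unit_vector(rng))
    before = maxwell(v1) * maxwell(v2)
    after = maxwell(w1) * maxwell(w2)
    max_relative_gap = max(max_relative_gap, abs(after - before) / before)
```

The largest relative difference found is of the order of the rounding error. The check shows only that the exponential law is a solution; that it is the only one for a gas at rest requires the argument of the text.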

Before doing so, let me mention that the hydro-thermal 
equations of a gas, i.e. the equations of continuity, of motion and 


of conduction of heat, are obtained from Boltzmann's equation 
(6.23) by a simple formal process (multiplication with $1$, $\boldsymbol{\xi}$, and
$\tfrac{1}{2}m\xi^2$ followed by integration over all velocities) in terms of the
stress tensor $T$ (you will remember Cauchy's general formula
(4.9)), which itself is expressed in terms of the distribution
function $f$. To give these equations a real meaning, one has to
expand $f$ in terms of physical quantities, and this is the object of
the theories contained in Chapman and Cowling's book. In this
way a very satisfactory theory of hydro-thermodynamics of gases,
including viscosity, conduction of heat, and diffusion, is obtained.


I remember that forty years ago when I began to read scientific 
literature there was a violent discussion raging about statistical 
methods in physics, especially the H-theorem. The objections
raised have been classified into two types, one concerning reversi- 
bility, the other periodicity. 

Loschmidt, like Boltzmann, a member of the Austrian school, 
formulated the reversibility objection in this way: by reversing 
all velocities you get from any solution of the mechanical 
equations another one; how can the integral $S$, which depends
on the instantaneous situation, increase in both cases ? 

The periodicity objection is based on a theorem of the great 
French mathematician Henri Poincaré, which states that every
mechanical system is, if not exactly periodic, at least quasi- 
periodic. This follows from Liouville's theorem according to 
which a given region in phase space moves without change of 
volume and describes therefore a tube-shaped region of ever 
increasing length. As the total volume available is finite (it is 
contained in the surface of maximum energy), this tube must 
somewhere intersect itself, which means that final and initial 
states come eventually near together. 
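A toy illustration of quasi-periodicity, under assumptions not made in the text: take two uncoupled oscillators whose frequencies stand in the ratio $1 : \sqrt{2}$. The system is never exactly periodic, yet after a suitable whole number of periods of the first oscillator the second is back at its initial phase to any desired accuracy. The function name `first_return` and the tolerances are hypothetical.

```python
import math

def first_return(tolerance, frequency_ratio=math.sqrt(2.0), max_periods=100_000):
    """Periods of oscillator 1 after which oscillator 2 is back within
    `tolerance` radians of its initial phase (None if not found)."""
    for k in range(1, max_periods + 1):
        phase = (2.0 * math.pi * frequency_ratio * k) % (2.0 * math.pi)
        distance = min(phase, 2.0 * math.pi - phase)
        if distance < tolerance:
            return k
    return None

coarse = first_return(0.1)    # first return to within 0.1 radian
fine = first_return(0.01)     # a sharper return takes much longer
```

The sharper the accuracy demanded, the longer one has to wait for the return, which is the point on which the later discussion of Zermelo's objection turns.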

Zermelo, a German mathematician, who worked on abstract 
problems like the theory of Cantor's sets and transfinite num- 
bers, ventured into physics by translating Gibbs's work on 
statistical mechanics into German. But he was offended by the 
logical imperfections of this theory and attacked it violently. 
He used in particular Poincaré's theorem to show how scandalous
the reasoning of the physicists was: they claimed to have
proved the irreversible increase of a mechanical quantity for 
a system which returns after a finite time to its initial state with 
any desired accuracy. 

These objections were not quite futile, as they led two distinguished
physicists, Paul Ehrenfest and his wife Tatjana, to
investigate and clear up the matter beyond doubt in their well- 
known article in vol. iv of the Mathematical Encyclopedia. 

To-day we hardly need to follow all the logical finesses of this 
work. It suffices to point out that the objections are based on the 
following misunderstanding. If we describe the behaviour of the 
gas (we speak only of this simple case, as for no other case has 
the H-theorem been proved until recently) by the equation (6.4),
taking for $H$ the Hamiltonian of the whole system, a function of
the coordinates and momenta of all particles, then $f$ is indeed
reversible and quasi-periodic, and no H-theorem can be proved.

Boltzmann's proof is based not on this equation, but on (6.23), 
where now H is the Hamiltonian of one single molecule un- 
disturbed by the others, and where the right-hand term is not 
zero but equal to the collision integral $C(1)$. The latter is taken as
representing roughly the effect of all the other molecules; 
'roughly', that means after some reasonable averaging. This 
averaging is the expression of our ignorance of the actual micro- 
scopic situation. Boltzmann's theorem says that this equation 
mixing mechanical knowledge with ignorance of detail leads to 
irreversibility. There is no contradiction between the two statements.

But there rises the other question whether such a modification 
of the fundamental equation is justified. We shall see presently 
that it is indeed, in a much wider sense than that claimed by 
Boltzmann, namely not only for a gas, but for any substance 
which can be described by a mechanical model. We have there- 
fore now to take up the question of how statistical methods can 
be applied to general mechanical systems. Without such a 
theory, one cannot even treat the deviations from the so-called 
ideal behaviour of gases (Boyle's law), which appear at high 
pressure and low temperature, and which lead to condensation. 


Theories like that of van der Waals have obviously only pre- 
liminary character. What is needed is a general and well- 
founded formalism covering the gaseous, liquid, and solid states, 
under all kinds of external forces.

For the case of statistical equilibrium, this formalism was 
supplied by Willard Gibbs's celebrated book on Statistical 
Mechanics (1901), which has proved to be extremely successful 
in its applications (see Appendix, 18). The gist of Gibbs's idea 
is to apply Boltzmann's results for a real assembly of many 
equal molecules to an imaginary or 'virtual' assembly of many
copies of the system under consideration, and to postulate that 
the one system under observation will behave like the average 
calculated for the assembly. Before criticizing this assumption, 
let us have a glimpse of Gibbs's procedure. He starts from 
Liouville's theorem (6.4) and considers especially the case of 
equilibrium where the partition function $f$ of his virtual assembly
has to satisfy equation (6.9). He states that $f = \Phi(H)$ is a solution
(as we have seen) and he chooses two particular forms of
the function $\Phi$. The first is

$$f = \Phi(H) = \begin{cases} \text{const.}, & \text{if } E < H < E + \Delta E, \\ 0 & \text{outside of this interval,} \end{cases} \tag{6.29}$$

where $E$ is a given energy and $\Delta E$ a small interval of energy.
(In modern notation one could write $\Phi(H) = \delta(H - E)$, where
$\delta$ is Dirac's symbolic function.) The corresponding distribution
he calls micro-canonical.

The second form is just that of Maxwell-Boltzmann, 

$$f = e^{\alpha - \beta E}, \qquad H(p, q) = E, \tag{6.30}$$

and the corresponding distribution is called canonical. Gibbs shows 
that both assumptions lead to the same results for the averages 
of physical quantities. But the canonical is preferable, as it is 
simpler to handle. $\beta$ turns out again to be equal for systems in
thermal equilibrium; if one puts $\beta = 1/kT$ the formal relations
between the averages constructed with (6.30) are a true replica
of thermodynamics. For instance, the normalization condition
for the probability

$$\int f\,dp\,dq = \int e^{\alpha - \beta E}\,dp\,dq = 1 \tag{6.31}$$

can be written

$$F = \frac{\alpha}{\beta} = -kT\log Z, \qquad Z = \iint e^{-E/kT}\,dp\,dq. \tag{6.32}$$

This $F$ plays the part of Helmholtz's free energy. The integral
$Z$, to-day usually called the 'partition function', depends, apart
from the energy $E$, on molar parameters like the volume $V$.
All physical properties can be obtained by differentiation, e.g.
entropy $S$ and pressure $p$ by

$$S = -\frac{\partial F}{\partial T}, \qquad p = -\frac{\partial F}{\partial V}. \tag{6.33}$$
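The way in which physical properties follow from the free energy by differentiation can be sketched for a toy system chosen here for illustration: a single free particle confined to a segment of length $L$, for which the phase integral is $Z = L\sqrt{2\pi m k T}$. The length plays the part of the volume; the units ($k = m = 1$), the numerical values, and the helper names `free_energy` and `derivative` are assumptions of the sketch.

```python
import math

def free_energy(temperature, length, mass=1.0, k=1.0):
    """F = -kT log Z for one free particle on a segment (toy system, k = m = 1)."""
    # Z = integral over q (giving the length) times the Gaussian integral over p.
    z = length * math.sqrt(2.0 * math.pi * mass * k * temperature)
    return -k * temperature * math.log(z)

def derivative(func, x, h=1e-6):
    """Central finite difference, adequate for this smooth function."""
    return (func(x + h) - func(x - h)) / (2.0 * h)

T, L = 2.0, 3.0
pressure = -derivative(lambda length: free_energy(T, length), L)   # p = -dF/dL
entropy = -derivative(lambda temp: free_energy(temp, L), T)        # S = -dF/dT
mean_energy = free_energy(T, L) + T * entropy                      # U = F + TS
```

The pressure comes out as $kT/L$, the one-dimensional counterpart of the ideal gas law, and the mean energy as $\tfrac{1}{2}kT$.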

This formalism has been amazingly successful in treating thermo- 
mechanical and also thermo-chemical properties. For instance, 
the theory of real (non-ideal) gases is obtained by writing 

$$E = H(p, q) = K(p) + U(q), \tag{6.34}$$

where K is the kinetic and U the potential energy; the latter 
depends on the mutual interactions of the molecules. As K is 
quadratic in the p, the corresponding integration in Z is easily 
performed and the whole problem reduces to calculating the 
multiple integral 

$$Q = \int \ldots \int e^{-U(q_1, q_2, \ldots, q_N)/kT}\,dq_1\,dq_2 \ldots dq_N. \tag{6.35}$$

Still, this is a very formidable task, and much work has been 
spent on it. I shall mention only the investigations initiated by 
Ursell, and perfected by Mayer and others, the aim of which was 
to replace van der Waals's semi-empirical equation of state by 
an exact one. In fact one can expand $Q$ into a series of powers of
$V^{-1}$ and, introducing this into (6.32) and (6.33), one obtains the
pressure $p$ as a similar series

$$p = \frac{RT}{V}\left(1 + \frac{A}{V} + \frac{B}{V^2} + \ldots\right), \tag{6.36}$$

where the coefficients $A, B, \ldots$, called virial coefficients, are
functions of $T$. One can even go further and discuss the process
of condensation, but the mathematical difficulties in the treat- 
ment of the liquid state itself are prohibitive. 

The range of application of Gibbs's theory is enormous. But 
reading his book again, I felt the lack of a deeper foundation. 


A few years later (1902, 1903) there appeared a series of papers 
by Einstein in which the same formalism was developed, 
obviously quite independently, as Maxwell and Boltzmann are 
quoted, but not Gibbs; these papers contained two essential 
improvements: an attempt to justify the statistical assumptions, 
and an application to a case which at once transformed the kinetic 
theory of matter from a useful hypothesis into something very real 
and directly observable, namely, the theory of Brownian motion. 

Concerning the foundation, Einstein used an argument which 
Boltzmann had already introduced to support his distribution 
law (6.18), though this seems to be hardly necessary, as for a
real assembly the method of enumerating distributions over 
cells is perfectly satisfactory. Curiously enough, this argument 
of Boltzmann is based on a theorem similar to Poincaré's considerations
on quasi-periodicity with which Zermelo intended
to smash statistical mechanics altogether. Einstein considers a 
distribution of the micro-canonical type, in Gibbs's nomen- 
clature, where only one 'energy surface' H(p,q) = E in the phase 
space is taken into account. The representative point in phase 
space moves always on this surface. It may happen that the 
whole surface is covered in such a way that the orbit passes 
through every point of the surface. Such systems are called 
ergodic; but it is rather doubtful whether they exist at all. 
Systems are called quasi-ergodic where the orbit comes near to 
every point of the energy surface; that this happens can be seen 
by an argument similar to that which leads to Poincaré's theorem
of quasi-periodicity. Then it can be made plausible that the total 
time of sojourn of the moving point in a given part of the energy 
surface is proportional to its area; hence the time average of any 
function of p, q is obviously the same as that taken with the help 
of a micro-canonical virtual assembly. In this way quasi- 
periodicity is used to justify statistical mechanics, exactly 
reversing Zermelo's reasoning. This paradox is resolved by the
remark that Zermelo believes the period to be large and macro-
scopic, while Einstein assumes it to be unobservably small.
Who is right? You may find the obvious answer for yourselves
(see Appendix, 19). 


Modern writers use other ways of establishing the foundations 
of statistical mechanics. They are mostly adaptations of the 
cell-method to the virtual assembly; one has then to explain why 
the average properties of a single real observed system can be 
obtained by averaging over a great many systems of the virtual 
assembly. Some say simply: As we do not know the real state, 
we have the right to expect the average provided exceptional 
situations are theoretically extremely rare and this is of course 
the case. Others say we have not to do with a single isolated 
system, but with a system in thermal contact with its surround- 
ings, as if it were in a thermostat or heat bath; we can then 
assume this heat bath to consist of a great many copies of the 
system considered, so that the virtual assembly is transformed 
into a real one. I think considerations of this kind are not very 

There remains the fact that statistical mechanics has justified 
itself by explaining a great many actual phenomena. Among 
these are the fluctuations and the Brownian motion to which 
Einstein applied his theory (see Appendix, 20). To appreciate 
the importance of this step one has to remember that at that 
time (about 1900) atoms and molecules were still far from being
as real as they are to-day; there were still physicists who did
not believe in them. After Einstein's work this was hardly
possible any longer. Visible tiny particles suspended in a gas or a 
liquid (colloid solution) are test bodies small enough to reveal 
the granular structure of the surrounding medium by their 
irregular motion. Einstein showed that the statistical properties 
of this movement (mean density, mean square displacement in 
time, etc.) agree qualitatively with the predictions of kinetic 
theory. Perrin later confirmed these results by exact measure- 
ments and obtained the first reliable value of Avogadro's num- 
ber N, the number of particles per mole. From now on kinetic 
theory and statistical mechanics were definitely established. 

But beyond this physical result, Einstein's theory of Brownian 
motion had a most important consequence for scientific metho- 
dology in general. The accuracy of measurement depends on the 
sensitivity of the instruments, and this again on the size and 


weight of the mobile parts and the restoring forces acting on 
them. Before Einstein's work it was tacitly assumed that 
progress in this direction was limited only by experimental 
technique. Now it became obvious that this was not so. If an 
indicator, like the needle of a galvanometer, became too small 
or the suspending fibre too thin, it would never be at rest but 
perform a kind of Brownian movement. This has in fact been 
observed. Similar phenomena play a large part in modern 
electronic technique, where the limit of observation is given by 
irregular variations which can be heard as a 'noise' in a loud 
speaker. There is a limit of observability given by the laws of 
nature themselves. 

This is a striking example that the code of rules for inference 
by induction, though perhaps metaphysical in some way, is 
certainly not a priori, but subject to reactions from the know- 
ledge which it has helped to create. For those rules which taught 
the experimentalist how to obtain and improve the accuracy of 
his findings contained to begin with certainly no hint that there 
is a natural end to the process. 

However, the idea of unlimited improvement of accuracy need 
not be given up yet. One had only to add the rule: make your 
measurements at as low a temperature as possible. For Brownian 
motion dies down with decreasing temperature. 

Yet later developments in physics proved this rule also to be 
ineffective, and a much more trenchant change in the code had 
to be made. 

But before dealing with this question we have to finish our 
review of statistical methods in classical mechanics. 


Kinetic theory could only be regarded as complete if it 
applied to matter in (visible) motion as well as to equilibrium. 
But if you look through the literature you will find very little:
only a few simple cases. The most important of these, the theory of
gases, has been dealt with in some detail. Two others must be
mentioned: the theory of solids and of the Brownian motion.

Ideal solids are crystal lattices or gigantic periodic molecules. 


But only for zero temperature are the atoms in regularly spaced 
equilibrium positions; for higher temperatures they begin to 
vibrate. As long as the amplitudes are small, the mutual forces 
will be linear functions of them; then the vibrations can be 
analysed into 'normal modes', each of which is a wave running 
through the lattice with a definite frequency. These normal 
modes represent a system of independent harmonic oscillations 
to which Gibbs's method of statistical mechanics can be applied 
without any difficulty. If, however, the temperature rises, the 
amplitudes of the vibrations increase and higher terms appear in 
the interaction: the waves are scattering one another and are 
therefore strongly damped. Hence there exists a kind of free 
path for the transport of energy which can be used for explaining 
conduction of heat in crystals (Debye). Similar considerations, 
applied to the electrons in metallic crystals, are used for the 
explanation of transport phenomena like electric and thermal 
conduction in metals. 

In the case of Brownian motion, I have already mentioned that 
Einstein calculated not only the mean density of a colloidal 
solution, say, under gravity, but also the mean square displace- 
ment of a single suspended particle in time (or, what amounts to 
the same, the dispersion of a colloid by diffusion as a function 
of time). The simplifying assumption which makes this possible 
is that the mass of the colloidal particle is large compared with the 
mass of the surrounding molecules, so that these impart only 
small impulses. Similar considerations have been applied to 
other fluctuation phenomena (see Appendix, 20). 
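The linear growth of the mean square displacement with time can be tried out numerically. The following is a minimal sketch, assuming the crudest possible model: an unbiased one-dimensional random walk with unit steps, for which the mean square displacement after t steps is exactly t (that is, $2Dt$ with $D = 1/2$ in step units). The model, the seed, and the function name are illustrative assumptions, not Einstein's calculation.

```python
import random

def mean_square_displacement(walkers, steps, seed=0):
    """Return msd[t] for t = 0..steps, averaged over independent 1-D walkers."""
    rng = random.Random(seed)
    sums = [0.0] * (steps + 1)
    for _ in range(walkers):
        x = 0
        for t in range(1, steps + 1):
            # Each small impulse moves the heavy particle one unit left or right.
            x += 1 if rng.random() < 0.5 else -1
            sums[t] += x * x
    return [s / walkers for s in sums]

msd = mean_square_displacement(walkers=4000, steps=400)
# msd[t] should stay close to t, so doubling the time roughly doubles the spread.
```

The individual paths are entirely irregular; only the average over many particles (or, equivalently, the spreading of a colloid by diffusion) follows the simple law.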

A great number of more or less isolated examples of non- 
equilibria have been treated by a semi-empirical method which 
uses the notion of relaxation time. You find a very complete 
account of such things related to solids and liquids in a book of 
J. Frenkel, Kinetic Theory of Liquids. But you must not expect 
to find in this work a systematic theory, based on a general idea, 
nor will you find it in any other book. 

My collaborator Dr. Green and I have tried to fill this gap, 
and to develop the kinetic theory of matter in general. I hope 
you will not mind if I indulge a little in the pleasure of explaining 



the leading ideas. It will help the understanding of the interplay 
of cause and chance in the laws of nature. 

We have to remember the general principles laid down by 
Gibbs which he, however, used only for the case of statistical
equilibrium.

An arbitrary piece of matter, fluid or solid, is, from the atomis- 
tic standpoint, a mechanical system of particles (atoms, mole- 
cules) defined by a Hamiltonian H. Its state is completely 
determined if the initial values of coordinates and momenta are 
given. Actually this is not the case; but there is a probability 
(as yet unknown) f(p,q)dpdq for the initial distribution. The 
causal laws of motion demand that the distribution $f(t, p, q)$ at
a later time t is a solution of the Liouville equation (6.4) (p. 49)

$\frac{\partial f}{\partial t} - [H, f] = 0,$ (6.37)

namely that solution which for t = 0 becomes $f(p, q)$:

$f(0, p, q) = f(p, q).$ (6.38)

Let us assume for simplicity that all molecules are equal
particles (point masses) with coordinates $x^{(k)}$ and velocities
$\xi^{(k)} = p^{(k)}/m$. We shall consider f to be a function of these and
write $f(t, x, \xi)$. If we want to indicate that a function f depends on h
particles, we do not write all arguments, but simply $f_h(1, 2, \ldots, h)$ or
shortly $f_h$. As all the particles are physically indistinguishable, we
can assume all the functions $f_h$ to be symmetrical in the particles.

Now the physicist is not directly interested in a symmetric
solution $f_N$ of (6.37). He wants to know such things as the num-
ber density (number of particles per unit volume) $n_1(t, x)$ at a
given point x of space, or perhaps, in addition, the velocity dis-
tribution $f_1(t, x, \xi)$, i.e. just those quantities which are familiar
from the kinetic theory of gases, depending on one particle only.

One has therefore to reduce the function $f_N$ for N particles step
by step to the function $f_1$ of one particle.

This is done by integrating over the position and velocity of
one particle, say the last one, with the help of the integral
operator

$\chi_q \ldots = \int\!\!\int \ldots \, dx^{(q)} \, d\xi^{(q)}.$ (6.39)

If $f_{q+1}$ is given, we obtain $f_q$ by applying the operator $\chi_{q+1}$; it is,
however, convenient to add a normalizing factor and to write

$(N - q) f_q = \chi_{q+1} f_{q+1}.$ (6.40)

The physical meaning of the operation is this: we give up the 
pretence to know the whereabouts of one particle and declare 
frankly our ignorance. By repeated application of the operation, 
we obtain a chain of functions 

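$f_N, f_{N-1}, \ldots, f_2, f_1,$ (6.41)

to which one can add $f_0 = 1$; $f_q$ means the probability of finding
the system in a state where q particles have fixed positions (i.e.
lie in given elements). The normalization is such that

$\int f_1(t, x^{(1)}, \xi^{(1)}) \, d\xi^{(1)} = n_1(t, x^{(1)})$ (6.42)

is the number density; for one has

$\int n_1(t, x^{(1)}) \, dx^{(1)} = \int\!\!\int f_1(t, x^{(1)}, \xi^{(1)}) \, dx^{(1)} \, d\xi^{(1)} = \chi_1 f_1 = N,$ (6.43)

where the last equality follows from (6.40) for q = 0 (with
$f_0 = 1$).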
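The reduction and its normalization can be tried out on a small discrete model. The sketch below assumes three particles distributed independently over three cells, ignores velocities, and replaces the integral over the last particle by a sum over its cell; the cell probabilities and all names are hypothetical. It applies the rule $(N - q) f_q = \chi_{q+1} f_{q+1}$ repeatedly, starting from a symmetric $f_N$ normalized so that the chain ends at $f_0 = 1$.

```python
from itertools import product
from math import factorial, prod

def reduce_once(f_next, q, n_particles):
    """Apply (N - q) f_q = chi_{q+1} f_{q+1}: sum over the last particle, then normalise."""
    summed = {}
    for args, value in f_next.items():
        summed[args[:-1]] = summed.get(args[:-1], 0.0) + value
    return {args: total / (n_particles - q) for args, total in summed.items()}

N = 3                      # number of particles (illustrative)
p = [0.5, 0.3, 0.2]        # hypothetical probability of one particle being in each cell
cells = range(len(p))

# Symmetric N-particle function for independent particles, with total weight N!.
f = {N: {x: factorial(N) * prod(p[i] for i in x) for x in product(cells, repeat=N)}}
for q in range(N - 1, -1, -1):
    f[q] = reduce_once(f[q + 1], q, N)

# f[1] plays the role of the number density: it sums to N over all cells.
number_density = [f[1][(c,)] for c in cells]
```

Each step discards the whereabouts of one more particle, and the last function left, `f[0]`, is simply the number 1.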

Now we have to reduce the fundamental equation (6.37) step by
step by repeatedly applying the operator $\chi$ (see Appendix, 21).
Assuming that the atoms are acting on one another with central
forces, $\Phi^{(ij)}$ being the potential energy between two of them, the
result of the reduction is a chain of equations of the form

$\frac{\partial f_q}{\partial t} = [H_q, f_q] + S_q \quad (q = 1, 2, \ldots, N),$ (6.44)

where

$S_q = \sum_{i=1}^{q} \chi_{q+1} [\Phi^{(i, q+1)}, f_{q+1}].$ (6.45)


This quantity $S_q$ will be called the statistical term. What is the
advantage of this splitting up of the problem into the solution of
a chain of equations? The first impression is that there is no
advantage at all; for to determine $f_1$ you need to know $S_1$, but $S_1$
contains $f_2$, and this again depends on $f_3$, and so on, so that one
eventually arrives at $f_N$, which satisfies the original equation.
Yet this reasoning supposes the desire to get information about 
every detail of the motion, and that is just what we do not want. 


We wish to obtain some observable and rather crude averages. 
Starting from $f_1$ and climbing up to $f_2, f_3, \ldots$, we can soon stop, as
the chaos increases with the number of particles, and replace the
rigorous connexion between $f_q$ and $f_{q+1}$ by an approximate one,
according to the imperfection of observation. 

Before explaining the application of this 'method of ignorance' 
to simple examples, I wish to mention that we have actually 
found the chain of equations (6.44) in quite a different way, 
starting with $f_1$ and using the calculus of probability for events
not independent of one another (see Appendix, 22). 

This derivation is less formal than the first one and illuminates 
the physical meaning of the statistical term. 

It would now be very attractive to show how from this general 
formula (6.44) the mechanical and thermal laws for continuous 
substances can be derived. But I have to restrict myself to a 
few indications concerning the general 'method of ignorance', 
to which I have already alluded. 

The first example is the theory of gases. We have seen that
this theory is based on Boltzmann's equation (6.23)

$\frac{\partial f(1)}{\partial t} = [H, f(1)] + C(1),$ (6.46)

where C(1) is the collision integral (6.24);

$C(1) = \int\!\!\int [f'(1) f'(2) - f(1) f(2)] \, |\xi_1 - \xi_2| \, db \, d\xi_2.$ (6.47)

Now (6.46) has the same form as our general formula (6.44) for
q = 1, provided C(1) can be identified with $S_1$.

Green has shown that this is indeed the case, provided that the 
molecular forces have a small range r ; then one can assume that 
in the gaseous state the probability of finding more than two 
particles in a distance smaller than r is negligible. In other 
words, one can exclude all except 'binary' encounters. Two 
particles outside the sphere of interaction can be regarded as 
independent; hence one has there 

$f_2(1, 2) = f_1(1) \cdot f_1(2).$ (6.48)

This holds also, in virtue of Liouville's theorem, if on the left-
hand side the positions and velocities refer to a point in the


interior of the sphere of action while on the right-hand side the 
values on its surface are used. With the help of this fact, the 
integration in $S_1$ can be performed (see Appendix, 23), and
leads exactly to the expression C(1), in which only the 'boundary
values' of the functions f(1) and f(2) on the surface of the sphere
of action appear. 

Hence the whole kinetic theory of gases is contained as a 
special case in our theory. 

Concerning liquids, one must proceed in a different way, 
because triple and higher collisions cannot be handled with 
elementary formulae. We have adopted a method suggested by 
the American physicist Kirkwood. His formula is a generali- 
zation of (6.48), namely 

$f_3(1, 2, 3) = \frac{f_2(2, 3) \, f_2(3, 1) \, f_2(1, 2)}{f_1(1) \, f_1(2) \, f_1(3)},$ (6.49)

and may be interpreted in different ways, e.g. by saying that the 
occurrences of three pairs of particles (2, 3), (3, 1), (1, 2) at given 
positions and with given velocities are almost independent 
events, because the mutual interactions decrease rapidly with 
the distance. 

Substituting $f_3$ from (6.49) in $S_2$, one obtains from (6.44),
(6.45) two integro-differential equations for $f_1$ and $f_2$ which form
a closed system and can be solved by suitable approximations.
(If then $f_3$ is calculated from the solution $f_1, f_2$, with the help of
(6.49), the relation (6.40) for q = 3 is not necessarily satisfied; 
this is the sacrifice of accuracy introduced by the Kirkwood 

All physical properties of a liquid of the kind discussed here 
(particles with central forces) can be expressed in terms of 
$n_2(1, 2)$, a function known to the experimenters in X-ray research
on liquids as the radial distribution function. The method 
explained leads to explicit formulae for the equation of state 
and the energy; it allows also a discussion of the singularity 
which separates the gaseous and liquid states. But I cannot 
enter into a discussion of details. 

Concerning non-equilibria, one can obtain the differential 


equations for the mechanical and thermal flow in a rigorous way; 
the result has, of course, the form of Cauchy's equations (4.9) 
for continuous media, yet with a stress tensor T^ which can be 
explicitly expressed in terms of the time derivatives of the strain 
tensor (or the space derivative of the velocity) and the gradient 
of temperature. In this way expressions for the coefficients of 
viscosity and thermal conductivity are obtained. They differ 
from the known formulae for gases by the great contribution of 
the mutual forces. Yet again I cannot dwell on this subject 
which would lead us far from the main topic of these lectures, 
to which I propose now to return (see Appendix, 33). 


WHAT can we learn from all this about the general problem of
cause and chance? The example of gases has already shown us
that the introduction of chance and probability into the laws of 
motion removes the reversibility inherent in them; or, in other 
words, it leads to a conception of time which has a definite 
direction and satisfies the principle of antecedence in the 
cause-effect relation. 

The formal method consists in defining a certain quantity,

$S = -k \int f \log f \, dp \, dq,$ (7.1)

and showing that it never decreases in time: $dS/dt \geq 0$. In the
case of a gas, the function f was the distribution function $f_1$ of
one single molecule, a function of the point p, q of the phase
space of this molecule.

The same integral represents also the entropy of an arbitrary 
system in statistical mechanics, if f is replaced by $f_N$, the
distribution function in the 2N-dimensional phase space; it
satisfies all equilibrium relations of ordinary thermodynamics.

In the case of a gas, the time derivative of S could be deter-
mined with the help of Boltzmann's collision equation, and it
was found that always

$\frac{dS}{dt} \geq 0.$ (7.2)


I have stressed the point that this is not in contradiction to 
the reversibility of mechanics; for this reversibility refers to a 
distribution function of non-interacting molecules, satisfying 

$\frac{\partial f}{\partial t} = [H, f],$ (7.3)

while molecules colliding with one another satisfy

$\frac{\partial f}{\partial t} = [H, f] + C,$ (7.4)

where C is the collision integral. Irreversibility is therefore a 
consequence of the explicit introduction of ignorance into the 
fundamental laws. 

Now the same considerations hold for any system. If we take
for f the function $f_N$ of a closed system of N particles, (7.3) is
again satisfied, and if its solution is introduced into (7.1), it can 
be easily shown that dS/dt = 0. 

Irreversibility can be understood only by explicitly exempting 
a part of the system from causality. One has to abandon the 
condition that the system is closed, or that the positions and 
velocities of all particles are under control. The remarkable 
thing is that it suffices to assume one single particle beyond con- 
trol. Then we have to do with a system of N+ 1 particles, but 
concentrate our interest only on N of them. The partition func- 
tion of these N particles satisfies the equation (6.44) for q = N: 

$\frac{\partial f_N}{\partial t} = [H_N, f_N] + S_N,$ (7.5)

where $S_N$ is a certain integral over $f_{N+1}$ given by (6.45) for q = N.
For a solution of this equation (7.5) the entropy is either constant 
or increasing. This is of course a fortiori the case if the system 
of N particles is coupled to more complicated systems out of 
control (see Appendix, 24). 
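The contrast between the two cases can be made tangible with a discrete analogy, which is emphatically not the derivation of Appendix 24. The sketch assumes six phase cells and takes k = 1. A reversible, deterministic motion is mimicked by a permutation of the cells, which leaves $S = -\sum f \log f$ unchanged; loss of control is mimicked by letting a fraction of each cell's content leak to its neighbours, and under this mixing S never decreases. The cell count, the leak fraction, and the function names are assumptions made for illustration.

```python
import math

def entropy(f):
    """S = -sum f log f over phase cells (k set to 1), with 0 log 0 taken as 0."""
    return -sum(x * math.log(x) for x in f if x > 0.0)

def permute(f, shift=1):
    """Reversible, deterministic step: the content of every cell moves to another cell."""
    return f[-shift:] + f[:-shift]

def blur(f, w=0.25):
    """Step with ignorance: a fraction w of each cell leaks to each neighbour (periodic)."""
    n = len(f)
    return [(1 - 2 * w) * f[i] + w * f[i - 1] + w * f[(i + 1) % n] for i in range(n)]

f0 = [0.7, 0.2, 0.1, 0.0, 0.0, 0.0]      # hypothetical initial distribution
s_start = entropy(f0)
s_reversible = entropy(permute(f0))       # equal to s_start

history = [f0]
for _ in range(50):
    history.append(blur(history[-1]))
s_history = [entropy(f) for f in history]  # non-decreasing, tends to log(6)
```

In this toy model the final state is the uniform distribution over the cells, because all cells have been given the same energy; it illustrates only the direction of the change, not the canonical form of the final distribution.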

The increase of S continues until statistical equilibrium is 
reached, and it can be shown that the final distribution is the 
canonical one 

$f = e^{\alpha - \beta E}, \quad H(p, q) = E.$ (7.6)

This result is, in my opinion, the final answer to the age-old
question how the reversibility of classical mechanics and the
irreversibility of thermodynamics can be reconciled. The latter
is due to a deliberate renunciation of the demand that in prin-
ciple the fate of every single particle can be determined. You
must violate mechanics in order to obtain a result in obvious
contradiction to it. But one may say: this violation may be
necessary for practical reasons because one can neither observe
all particles nor solve the innumerable equations; in reality,
however, the world is reversible, and thermodynamics only a


trick for obtaining probable, not certain, results. This is the 
standpoint taken in many presentations of statistical mechanics. 
It is difficult to contradict if one accepts the axiom that the 
positions and velocities of all particles can, at least in principle, 
be determined; but can this really be maintained? We have
seen that the Brownian motion sets a limit to all observations 
even on a macroscopic scale. One needs a spirit who can do 
things we could not even do with infinitely improved technical 
means. Further, the idea of a completely closed system is also 
almost fantastic. 

I think that the statistical foundation of thermodynamics is 
quite satisfactory even on the ground of classical mechanics. 

But in fact, classical mechanics has turned out to be defective 
just in the atomic domain where we have applied it. The whole 
situation has therefore to be re-examined in the light of quantum 



IN order not to lose sight of my main subject I have added to the 
heading of each section of these lectures words like 'cause', 
'contiguity', 'antecedence', 'chance'. The one for the present 
section, 'matter', seems to be an intruder. For classical philo- 
sophy teaches that matter is a fundamental conception of a 
specific kind, entirely different from cause, though on the same 
level in the hierarchy of notions: another 'category' in Kant's 
terminology. This doctrine was generally accepted at the time 
before the great discoveries were made of which I have now to 
speak. It was the period when physics was governed by the 
dualism 'force and matter', Kraft und Stoff (the title of a popular 
book by Büchner). In modern physics this duality has become
vague, almost obsolete. The first steps in this direction have been 
described in the preceding survey: the transition from Newton's 
distance forces to contact forces, first in mechanics, then in 
electromagnetism, and finally for gravitation; in other words, 
the victory of the idea of contiguity. If force is spreading in 
'empty space' with finite velocity, space cannot be quite empty; 
there must be something which carries the forces. So space is 
filled with ether, a kind of substance akin to ordinary matter 
in many respects, in which strains and stresses can be produced. 
Though these contact forces obey different laws from those which 
govern elasticity, they are still forces in an ether, something 
different from the carrier. Yet this distinction vanishes more 
and more. Relativity showed that the ether does not share with 
ordinary matter the property of 'localization': you cannot say
'here I am'; there is no physical way of identifying a point in the
ether, as you could recognize a point in running water by a 
little mark, a particle of dust. Electric and magnetic stresses 
are not something in the ether, they are 'the ether'. The question
of a carrier becomes meaningless. 

However, this is a question of interpretation. Physicists are 


very broad-minded in this respect; they will continue using obso- 
lete expressions like ether, and no harm is done. For them a 
matter of terminology is not serious until a new quantitative 
law is involved. That has happened here indeed. I refer to the 
law connecting mass m, energy $\epsilon$, and the velocity of light c
(see Appendix, 25),

$\epsilon = mc^2,$ (8.1)

which, after having been found to hold in special cases, was 
generally established by Einstein. His reasoning is based on the 
existence of the pressure of light, demonstrated experimentally 
and also derivable from Maxwell's equations of electrodynamics. 
If a body of mass M emits a well-defined quantity of light in a
parallel beam which carries the total electromagnetic energy $\epsilon$,
it suffers a recoil corresponding to the momentum $\epsilon/c$ transferred.
It therefore moves in the opposite direction, and to avoid a clash
with the mechanical law that the centre of mass of a system
cannot be accelerated by an internal process, one has to ascribe
to the beam of light not only an energy $\epsilon$ and momentum $\epsilon/c$ but
also a mass $\epsilon/c^2$, and to assume that the mass M of the emitting
body is reduced by the same amount $m = \epsilon/c^2$.†

The theory of relativity renders this result quite natural. It 
provides, moreover, an expression for the dependence of mass 
on velocity; one has

$m = \frac{m_0}{\sqrt{1 - v^2/c^2}},$ (8.2)

where $m_0$ is called the rest-mass. Energy $\epsilon$ and momentum p are
then given by

$\epsilon = mc^2, \quad p = mv.$ (8.3)
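A short numerical check may help to fix these relations. The sketch below assumes the usual forms $m = m_0/\sqrt{1 - v^2/c^2}$, $\epsilon = mc^2$ and $p = mv$, and verifies the consequence $\epsilon^2 - (pc)^2 = (m_0 c^2)^2$, which is not stated in the text but follows from them by elementary algebra. The function names and the sample rest-mass are illustrative choices.

```python
import math

C = 299_792_458.0  # velocity of light in m/s (exact SI value)

def moving_mass(m0, v):
    """Velocity-dependent mass m = m0 / sqrt(1 - v^2/c^2)."""
    return m0 / math.sqrt(1.0 - (v / C) ** 2)

def energy_momentum(m0, v):
    """Energy epsilon = m c^2 and momentum p = m v of a body moving with velocity v."""
    m = moving_mass(m0, v)
    return m * C**2, m * v

m0 = 9.109e-31                       # roughly the electron rest-mass in kg (sample value)
eps, p = energy_momentum(m0, 0.6 * C)

# The combination eps^2 - (p c)^2 does not depend on v; it equals (m0 c^2)^2.
invariant = math.sqrt(eps**2 - (p * C) ** 2)
```

At v = 0.6c the mass is 1.25 times the rest-mass, while `invariant` reproduces the rest energy.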

I need hardly to remind you how this result of 'purest science' 
has been lately confirmed by a terrifying, horrible, 'technical' 
application in New Mexico, Japan, and Bikini. There is no 
doubt, matter and energy are the same. The old duality between 
the force and the substance on which it acts, has to be aban- 
doned, and hence also the original idea of force as the cause of 
motion. We see how old notions are dissolved by new ex- 
periences. It is this process which has led me to the abstract 

† M. Born, Atomic Physics (Blackie), 4th ed., 1948, Ch. III. 2, p. 52; A. VII,
p. 288. See Appendix, 25. 


definition of causality based only on the notion of physical 
dependence, but transcending special theories which change 
according to the experimental situation. 

Returning to our immediate object, we learn from Einstein's 
law that the atomistic conception of matter is necessarily con- 
nected with the atomistic conception of energy. In fact the 
existence of quanta of energy was deduced by Planck from the 
laws of heat radiation five years before Einstein published his 
relation between mass and energy. 

Planck's discovery opened the first chapter in the history of 
quantum theory, which corresponds to the years 1900 to 1913 
and could be entitled 'Tracing the quantum by thermodynamical
and statistical methods'. The next chapter deals with the period
1913-25 when spectroscopical and electronic methods were in 
the foreground, while the last chapter describes the birth and 
development of quantum mechanics. 

I cannot possibly give an account of this long and tedious 
development, but I shall pick out a few points which are not so 
well known and hardly found in text-books, beginning with some 
remarks on the thermo-statistical quantum hunt. 

The problem which Planck solved was the determination of
the density of radiation $\rho$ in equilibrium with matter of a given
temperature T as function of T and of the frequency $\nu$, so that
$\rho(\nu, T) \, d\nu$ is the energy per unit volume in the frequency interval
$d\nu$. By purely thermodynamical methods several properties of
this function were known: the temperature dependence of the
total radiation $\int \rho \, d\nu = \sigma T^4$ (law of Stefan and Boltzmann) and
the specification that $\rho/\nu^3$ is only a function of the quotient $\nu/T$.†
The problem remained to determine this function, and here 
statistical methods had to be used. 

One can proceed in two ways. Either one regards the radiation 
as being in equilibrium with a set of atoms which in their inter- 
action with radiation can be replaced by harmonic oscillators; 
then the mean energy of these can be calculated in terms of the 
radiation density and turns out to be proportional to it. This 
was the method preferred by Planck. Or one regards the radia- 

† Law of Wien; see Atomic Physics, Ch. VII. 1, p. 198; A. XXVII, p. 343.


tion itself as a system of oscillators, each of these representing 
the amplitude of a plane wave. This method was used by Ray- 
leigh and later by Jeans. In both cases the relation between the 
mean energy $u(\nu)$ of the oscillators of frequency $\nu$ and the radia-
tion density $\rho$ is given by†

$\rho = \frac{8\pi\nu^2}{c^3} u,$ (8.4)

and it suffices to determine u. 

This can be done with the help of the so-called equipartition 
law of statistical mechanics. Suppose the Hamiltonian H of a 
system has the form

$H = \frac{a}{2} \xi^2 + H',$ (8.5)

where $\xi$ is any coordinate or momentum and H' contains all
the other coordinates and momenta but not $\xi$. Then the mean
value of the contribution to the energy of this variable is (see
Appendix, 26)

$\overline{\frac{a}{2} \xi^2} = \frac{1}{2} kT,$ (8.6)

independent of the constant a; hence the same for all variables
of that description.
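The independence of the constant a can be checked directly. The sketch below assumes that the variable enters the energy as $\frac{a}{2}\xi^2$ and that states are weighted by the Boltzmann factor $e^{-H/kT}$; the part H' then cancels between numerator and denominator, and only a one-dimensional integral over $\xi$ remains. The integration range, the step count, and the function name are illustrative assumptions.

```python
import math

def mean_quadratic_energy(a, kT, steps=20_000):
    """Boltzmann average of (a/2) xi^2, by midpoint integration over xi."""
    width = 12.0 * math.sqrt(kT / a)       # the weight is negligible beyond this range
    h = 2.0 * width / steps
    num = den = 0.0
    for i in range(steps):
        xi = -width + (i + 0.5) * h
        energy = 0.5 * a * xi * xi
        weight = math.exp(-energy / kT)
        num += energy * weight
        den += weight
    return num / den

# Very different stiffness constants a give the same mean energy kT/2 (here 1.0).
results = {a: mean_quadratic_energy(a, kT=2.0) for a in (0.1, 1.0, 50.0)}
```

A stiff variable (large a) makes small excursions and a soft one large excursions, but each carries the same average energy.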

Applied to a set of oscillators of frequency $\nu$, where

$H = \frac{1}{2m} (p^2 + 4\pi^2 \nu^2 m^2 q^2),$ (8.7)

one obtains for the average energy

$u = kT,$ (8.8)

hence, from (8.4),

$\rho = \frac{8\pi\nu^2}{c^3} kT.$ (8.9)

This is called the Rayleigh-Jeans radiation formula. It is a 
rigorous consequence of classical statistical mechanics, but 
nevertheless in obvious contradiction to facts. It does not even 
lead to a finite total radiation, since $\rho$ increases as $\nu^2$ with fre-
quency. The law is, however, not quite absurd as it agrees well
with measurements for small frequencies (long waves) or high 
temperatures. At the other end of the spectrum, the observed 

† See Atomic Physics, Ch. VII. 1, p. 201; A. XXVIII, p. 347.


energy density decreases again, and Wien has proposed for this 
region an experimental law which would correspond to the 
assumption that in (8.4) the oscillator energy is of the form

$u = u_0 e^{-\epsilon_0/kT}.$ (8.10)

This looks very much like a Boltzmann distribution. According
to Wien's displacement law it holds for high values of the
quotient $\nu/T$, and both constants $u_0$ and $\epsilon_0$ must be proportional
to $\nu$; but their meaning is obscure.

This was the situation which Planck encountered: two 
limiting cases given by the formulae (8.8) and (8.10), the first 
valid for large T, the second for small T. Planck set out to dis- 
cover a bridging formula; the difficulty of this task can be visual- 
ized by looking at the two mathematical expressions or the 
corresponding graphs in Fig. 1. Planck decided that the energy 
was a variable unsuited for interpolation, and he looked for 
another one. He found it in the entropy S. I shall give here
his reasoning in a little different form (due to Einstein, 1905), 
where the entropy does not appear explicitly but the formulae 
of statistical mechanics are used. Starting from Boltzmann's 


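distribution law, according to which the probability of finding
a system in a state with energy $\epsilon$ is proportional to $e^{-\beta\epsilon}$, where
$\beta = 1/kT$, one can express the mean square fluctuation of the
energy

$\overline{(\Delta\epsilon)^2} = \overline{(\epsilon - \bar\epsilon)^2} = \overline{\epsilon^2} - \bar\epsilon^2$

in terms of the average energy $\bar\epsilon = u$ itself if the latter is given as
function of temperature or of $\beta$ (see Appendix, 20.10):

$\overline{(\Delta\epsilon)^2} = -\frac{du}{d\beta}.$ (8.11)

Now this function $u(\beta)$ is known for the two limiting cases: T
large or $\beta$ small, and T small or $\beta$ large, from (8.8) and (8.10),

$u = \beta^{-1}$ for small $\beta$,
$u = u_0 e^{-\beta\epsilon_0}$ for large $\beta$. (8.12)

Hence one has

$\overline{(\Delta\epsilon)^2} = u^2$, small $\beta$,
$\overline{(\Delta\epsilon)^2} = \epsilon_0 u$, large $\beta$. (8.13)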

Now Planck argues like this: the two limiting cases will corre- 
spond to the preponderance of two different causes, whatever 
they may be. A well-known theorem of statistics says that the 
mean square fluctuations due to independent causes are additive. 
Let us assume that the condition of independence is here satisfied. 
Hence, if both causes act simultaneously, we should have 

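$\overline{(\Delta\epsilon)^2} = -\frac{du}{d\beta} = \epsilon_0 u + u^2.$ (8.14)

This is a differential equation for u, with the general solution

$u = \frac{\epsilon_0}{e^{\beta\epsilon_0 + a} - 1}.$ (8.15)

The constant of integration a must vanish in order to have the
limiting cases (8.12) all right. Wien's displacement law, accord-
ing to which $\rho/\nu^3 = 8\pi u/c^3\nu$ depends only on $T/\nu$, leads then to
$\epsilon_0 = h\nu$, where h is a constant, known as Planck's constant.
The result is Planck's formula for the mean oscillator energy

$u = \frac{\epsilon_0}{e^{\beta\epsilon_0} - 1} = \frac{h\nu}{e^{h\nu/kT} - 1},$ (8.16)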

from which the radiation density follows according to (8.9); a 


refined interpolation which turned out to be in so excellent agree- 
ment with experiment that Planck looked for a deeper explana- 
tion and discovered it in the assumption of energy quanta of 
finite size $\epsilon_0 = h\nu$. If the energy is a multiple of $\epsilon_0$, the integral
(6.32) has to be replaced by the sum 

and then the usual procedure outlined in section 6 leads at once 
to the expression (8.16) for the oscillator energy u. 
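The passage from quantized energies to Planck's formula can be verified numerically. The sketch below assumes that the oscillator energies are the multiples $n\epsilon_0$ (n = 0, 1, 2, ...), each weighted by the Boltzmann factor $e^{-\beta n \epsilon_0}$ with $\beta = 1/kT$, and truncates the sum at a large n. It compares the resulting mean energy with the closed form $u = \epsilon_0/(e^{\beta\epsilon_0} - 1)$ and checks that the mean square fluctuation equals $\epsilon_0 u + u^2$, the sum of the two limiting fluctuation laws. The truncation `n_max` and all names are assumptions of this illustration.

```python
import math

def oscillator_stats(eps0, beta, n_max=2000):
    """Mean energy and mean square fluctuation for energies n*eps0, n = 0, 1, 2, ..."""
    weights = [math.exp(-beta * n * eps0) for n in range(n_max)]
    z = sum(weights)
    mean = sum(n * eps0 * w for n, w in enumerate(weights)) / z
    mean_sq = sum((n * eps0) ** 2 * w for n, w in enumerate(weights)) / z
    return mean, mean_sq - mean**2

def planck_u(eps0, beta):
    """Closed form of the mean oscillator energy: eps0 / (exp(beta*eps0) - 1)."""
    return eps0 / math.expm1(beta * eps0)

eps0 = 1.0
checks = []
for beta in (0.05, 1.0, 8.0):          # high, medium and low temperature
    u, fluct = oscillator_stats(eps0, beta)
    checks.append((beta, u, planck_u(eps0, beta), fluct, eps0 * u + u * u))
```

For small $\beta$ the mean energy approaches the classical value $kT = 1/\beta$, and for large $\beta$ it approaches $\epsilon_0 e^{-\beta\epsilon_0}$, so the single formula bridges the two limiting cases.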

Planck believed that the discontinuity of energy was a pro- 
perty of the atoms, represented by oscillators in their interaction 
with radiation, which itself behaved quite normally. Seven 
years later Einstein showed that indeed wherever oscillations 
occur in atomic systems, their energy follows Planck's formula 
(8.16); I refer to his theory of specific heat of molecules and solids
which opened more than one new chapter of physics. But this
is outside the scope of these lectures.†

Einstein had, however, arrived already in 1905 at the con- 
clusion that radiation itself was not as innocent as Planck 
assumed, that the quanta were an intrinsic property of radia- 
tion and ought to be imagined to be a kind of particles flying 
about. In text-books this revival of Newton's corpuscular theory 
of light is connected with Einstein's explanation of the photo- 
electric effect and similar phenomena where kinetic energy of 
electrons is produced by light or vice versa. This is quite correct, 
but not the whole story. For it was again a statistical argument 
which led Einstein to the hypothesis of quanta of light, or 
photons, as we say to-day. 

He considered the two limiting cases (8.13) from a different 
point of view. Suppose the wave theory of light is correct, then 
heat radiation is a statistical mixture of harmonic waves of all 
directions, frequencies, and amplitudes. Then one can deter- 
mine the mean energy of the radiation and its fluctuation in a 
given section of a large volume. This calculation has been per- 
formed by the Dutch physicist, H. A. Lorentz, with the result 

† See Atomic Physics, Ch. VII. 2, p. 207.


that $\overline{(\Delta\rho)^2} = \rho^2$ for any frequency, or expressed in terms of the
equivalent oscillators, $\overline{(\Delta\epsilon)^2} = u^2$, in agreement with the
Rayleigh-Jeans case (small $\beta$, large $T$) in (8.13). Hence there
must be something else going on besides the waves, for which
$\overline{(\Delta\epsilon)^2} = \epsilon_0 u$; what can this be?

Suppose Planck's quanta exist really in the radiation and let
$n$ be their number per unit volume and unit frequency interval.
As each quantum has the energy $\epsilon_0 = h\nu$, one has $\bar\epsilon = u = n\epsilon_0$,
and $\overline{(\Delta\epsilon)^2} = \epsilon_0^2\,\overline{(\Delta n)^2}$. Hence the fluctuation law in Wien's case
(large $\beta$, small $T$) of (8.13) can be written as

$$\overline{(\Delta n)^2} = n. \tag{8.18}$$

This is a well-known formula of statistics referring to the 
following situation: a great number of objects are distributed at 
random in a big volume and n is the number contained in a part. 
Then one has just the relation (8.18) between the average n 
and its mean square fluctuation (see Appendix, 20). So Einstein 
was led to the conclusion that the Wien part of the fluctuation 
of energy is accounted for by quanta behaving like independent 
particles, and he corroborated it by taking into account, besides 
the energy, also the momentum hv/c of the quantum and the 
recoil of an atom produced by it. It was this result which en- 
couraged him to look for experimental evidence and led him to 
the well-known interpretation of the photo-electric effect as a 
bombardment of photons which knock out electrons from the 
metal transferring their energy to them. 
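The statistical formula behind (8.18) can be illustrated by a small simulation: objects are scattered at random in a volume and those falling into a small part are counted, many times over. This is a sketch under stated assumptions; the seed, the number of objects, the size of the part, and the number of trials are arbitrary choices. For a part occupying a finite fraction of the volume the counts are binomial, so variance and mean agree only approximately, the agreement improving as the fraction shrinks.

```python
import random

def count_in_part(n_objects, fraction, rng):
    """Scatter n_objects at random in a volume; count those in a part of it.

    `fraction` is the ratio of the part's volume to the whole volume.
    """
    return sum(1 for _ in range(n_objects) if rng.random() < fraction)

rng = random.Random(12345)  # fixed seed, an arbitrary illustrative choice
n_objects, fraction, trials = 1000, 0.02, 2000

counts = [count_in_part(n_objects, fraction, rng) for _ in range(trials)]
mean_n = sum(counts) / trials
var_n = sum((c - mean_n) ** 2 for c in counts) / trials

# For independent particles the two numbers should nearly coincide.
print(mean_n, var_n)
```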

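Expressed in terms of photon numbers the combined fluctuation
law (8.14) reads

$$-\frac{1}{\epsilon_0}\frac{dn}{d\beta} = n + n^2, \tag{8.19}$$

with the general solution

$$n = \frac{1}{e^{\alpha+\beta\epsilon_0}-1}, \tag{8.20}$$

where $\alpha = 0$ leads to the correct value for large $T$. But what if
$\alpha \neq 0$?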

Every physicist glancing at the last formula will recognize 
it as the so-called Bose-Einstein distribution law for an ideal 


gas of indistinguishable particles according to quantum theory. 
It is most remarkable that at this early stage of quantum theory 
Planck and Einstein have already hit on a result which was 
rediscovered much later (Einstein again participating) (see 
Appendix, 25, 32). In fact Planck's interpolation can be inter- 
preted in modern terms as the first and completely successful 
attempt to bridge the gulf between the wave aspect and the 
particle aspect of a system of equal and independent components 
whatever they may be photons or atoms. 
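That the distribution (8.20) satisfies the combined law for any value of the constant α, with a particle term n and a wave term n², can be checked numerically. The sketch below writes x for βε₀ and uses a central difference for the derivative; the function names and the sample values are illustrative assumptions.

```python
import math

def bose_einstein(x, alpha=0.0):
    """Mean occupation number n = 1 / (exp(alpha + x) - 1), with x = beta * eps0."""
    return 1.0 / math.expm1(alpha + x)

def minus_dn_dx(x, alpha=0.0, h=1e-5):
    """Central-difference estimate of -dn/dx (the mean square fluctuation)."""
    return -(bose_einstein(x + h, alpha) - bose_einstein(x - h, alpha)) / (2 * h)

x, alpha = 0.8, 0.3  # arbitrary illustrative values
n = bose_einstein(x, alpha)
fluctuation = minus_dn_dx(x, alpha)

# Particle part n plus wave part n**2 should reproduce the fluctuation.
print(fluctuation, n + n * n)
```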

I shall conclude this section by giving a short account of 
another consideration of Einstein's which belongs to a later 
period of quantum theory, when Bohr's theory of atoms was 
already well established, namely the existence of stationary 
states in the atoms which differ by finite amounts of energy 
content. Suppose the atom can exist in a lower state 1 and a 
higher state 2; transitions are possible by emission or absorption
of a light quantum of energy $\epsilon_2 - \epsilon_1 = \epsilon_0$, hence of frequency
$\nu = \epsilon_0/h$. On the other hand, according to Boltzmann's law
the relative number of atoms in the two states will be

$$\frac{N_2}{N_1} = e^{-\beta\epsilon_0}. \tag{8.21}$$

Now one can write (8.20), with $\alpha = 0$, in the form

$$(n+1)\,e^{-\beta\epsilon_0} = n,$$

or, using (8.21),

$$nN_2 + N_2 = nN_1. \tag{8.22}$$

For this equation Einstein gave the following interpretation: 
the left-hand side represents the number of quanta emitted per 
unit of time from the $N_2$ atoms in the higher state, the right-hand
side those absorbed by the $N_1$ atoms in the lower state,
two processes which in equilibrium must of course cancel one
another.

The absorption is obviously proportional to the number of
atoms in the lower state, $N_1$, and to the number $n$ of photons
present, i.e. to $nN_1$. Concerning the emission the term $N_2$ signifies
a spontaneous process, independent of the presence of radiation; 
it corresponds to the well-known emission of electromagnetic 


waves by a vibrating system of charges. The other term $nN_2$ is
a new phenomenon which was signalled the first time in this paper
of Einstein (later confirmed experimentally), namely a forced
or induced emission proportional to the number of photons
present.

If we denote the number of spontaneous emissions by $AN_2$,
of induced emissions by $B_{21}N_2n$, of absorptions $B_{12}N_1n$, we
learn from (8.22) that the probability coefficients (probabilities
per unit time, per atom, and per light-quantum) are all equal:

$$A = B_{12} = B_{21}. \tag{8.23}$$
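The balance expressed by (8.22) and (8.23) can be verified with a few lines of arithmetic. The sketch below takes the photon number from (8.20) with α = 0 and the ratio of atoms from (8.21), and compares emissions with absorptions for a common coefficient. The function names, the `induced` switch, and the sample value of x = βε₀ are illustrative assumptions.

```python
import math

def photon_number(x):
    """Mean photon number (8.20) with alpha = 0; x stands for beta * eps0."""
    return 1.0 / math.expm1(x)

def emission_and_absorption(x, n1=1.0, coeff=1.0, induced=True):
    """Transitions per unit time for a common coefficient A = B12 = B21 = coeff.

    n1 is the number of atoms in the lower state; the number in the upper
    state follows from Boltzmann's law (8.21). With induced=False the induced
    emission term is dropped, to show that balance then fails.
    """
    n = photon_number(x)
    n2 = n1 * math.exp(-x)
    emission = coeff * n2 + (coeff * n2 * n if induced else 0.0)
    absorption = coeff * n1 * n
    return emission, absorption

emitted, absorbed = emission_and_absorption(1.5)
emitted_no_induced, _ = emission_and_absorption(1.5, induced=False)
print(emitted, absorbed, emitted_no_induced)
```

Without the induced term the emissions fall short of the absorptions, which is why the spontaneous process alone cannot maintain equilibrium with Planck's radiation.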

This result had far-reaching consequences. The first is the existence
of a symmetric probability coefficient $B_{12} = B_{21}$ for transition
between two states induced by radiation. This became one
of the clues for the discovery of the matrix form of quantum
mechanics.

The second point is seen if one considers, not equilibrium, but 
a process in time; Einstein's consideration leads at once to the
equation

$$\frac{dn}{dt} = \frac{dN_1}{dt} = -\frac{dN_2}{dt} = A\{n(N_2-N_1)+N_2\}, \tag{8.24}$$

which is of the type used by the chemists for the calculation of 
reaction velocities. One has, in their terminology, three com- 
peting reactions, namely two diatomic ones and one monatomic 
one. Now genuine monatomic reactions are rare in ordinary 
chemistry, but abundant in nuclear chemistry; they were in 
fact until recently the only known ones, namely the natural 
radioactive disintegrations. If the radiation density is zero,
$n = 0$, one has

$$-\frac{dN_2}{dt} = AN_2, \tag{8.25}$$

which is exactly the elementary law of radioactive decay,
according to Rutherford and Soddy. It expresses the assumption
that the disintegrations are purely accidental and completely 
independent of one another. 
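The connexion between (8.25) and purely accidental, independent events can be made concrete by a simulation in which every atom, in each short time step, disintegrates with the same small probability regardless of its history or its neighbours. This is a sketch only; the seed, the number of atoms, the decay constant, and the step length are arbitrary assumptions.

```python
import math
import random

def surviving_atoms(n_atoms, decay_const, t_end, dt, rng):
    """Count atoms left after t_end when each decays independently.

    In every step of length dt each surviving atom disintegrates with
    probability decay_const * dt, whatever happened before.
    """
    alive = n_atoms
    steps = int(round(t_end / dt))
    for _ in range(steps):
        decays = sum(1 for _ in range(alive) if rng.random() < decay_const * dt)
        alive -= decays
    return alive

rng = random.Random(2024)  # fixed seed, an arbitrary illustrative choice
n0, a, t_end = 5000, 1.0, 1.0
survivors = surviving_atoms(n0, a, t_end, dt=0.01, rng=rng)
expected_fraction = math.exp(-a * t_end)  # solution of (8.25)

print(survivors / n0, expected_fraction)
```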

Thus Einstein's interpretation means the abandonment of 
causal description and the introduction of the laws of chance 
for the interaction of matter and radiation. 



Although my programme takes me through the whole history 
of physics, I am well aware that it is a very one-sided account of 
what really has happened. It will not have escaped you that I 
believe progress in physics essentially due to the inductive 
method (of which I hope to say a little more later), yet the experi- 
mentalist may rightly complain that his efforts and achieve- 
ments are hardly mentioned. Yet as I am concerned with the 
development of ideas and conceptions, I may be permitted 
to take the skill and inventive genius of the experimenters for 
granted and to use their results for my purpose without detailed 

The period about 1900, when quantum theory sprang from 
the investigations of radiation, was also full of experimental 
discoveries: radioactivity, X-rays, and the electron, are the 
major ones. 

In regard to the role of chance in physics, radioactivity was of 
special importance. As I said before, the law of decay is the 
expression of independent accidental events. Moreover, the 
decay constant turned out to be perfectly insensitive to all 
physical influences. There might be, of course, some internal 
parameters of the atom which determine when it will explode. 
Yet the situation is different from that in gas theory: there we 
know the internal parameters, or believe we know them, they 
are supposed to be ordinary coordinates and momenta; what 
we do not know are their actual values at any time, and we are 
compelled to take refuge in statistics because of this lack of 
detailed knowledge. In radioactivity, on the other hand, nobody 
had an idea what these parameters might be, their nature itself 
was unknown. However, one might have kindled the hope that 
this question would be solved and radioactive statistics reduced 
to ordinary statistical mechanics. In fact, just the opposite has
happened.

Radioactivity is also important for our problem because it 
provided the means of investigating the internal structure of the 
atom. You know how Rutherford used α-particles as projectiles
to penetrate into the interior of the atom, and found the nucleus. 


This result, together with J. J. Thomson's discovery of the elec- 
tron, led to the planetary model of the atom: a number of elec- 
trons surrounding the nucleus, bound to it by electric forces. The 
fundamental difficulty of this model is its mechanical instability. 

As long as nothing was known about the forces which keep the 
elementary particles in an atom together one could assume a 
law of force which allowed stable equilibrium states. An in- 
genious model of this kind is due to J. J. Thomson. But now one 
knew that the forces were electrostatic ones, following Coulomb's 
law, and these could never guarantee the extraordinary stability 
of the actual atoms which survive billions of collisions without 
any change of structure. Bohr connected this difficulty with the 
facts of spectroscopy, and the result was his well-known model 
of the atom consisting of 'quantized' electronic orbits. 

Mentioning spectroscopy, I feel again sadly how I have to 
skip over great fields of research with a few words. 

The discovery of simple laws in line spectra was in fact a great 
achievement. Still more important than numerical formulae, 
like the one discovered by the Swiss schoolmaster Balmer for 
the hydrogen spectrum, was the rule found by Ritz (also a 
Swiss, who unfortunately died quite young), the so-called combination
principle; it says that the frequencies of the spectral
lines of gases can be obtained by forming differences of a single
row of quantities $T_1, T_2, T_3, \ldots$, which are called terms:

$$\nu_{nm} = T_n - T_m, \tag{8.26}$$

though not all of these differences appear as lines in the spectrum.
Balmer's formula for hydrogen is a special case where $T_n = R/n^2$.

The formula (8.26) gave Bohr the clue to the application of 
quantum theory. Multiplying it by Planck's constant h he 
interpreted it as the energy difference $\epsilon_n - \epsilon_m = h\nu_{nm}$ between any
two stationary states having the energies $\epsilon_n = hT_n$ ($n = 1, 2, \ldots$).
This interpretation is a sweeping generalization of Planck's 
original conception of discrete energy-levels of oscillators. It 
explained at once the stability of atoms against impacts with an 


energy smaller than a certain threshold, the difference between 
emission and absorption spectra (the latter being of the form 
$h\nu_{n1} = \epsilon_n - \epsilon_1$, where 1 means the ground state), and was in
detail confirmed by the well-known experiments of Franck and
Hertz (excitation of spectra by electron bombardment). 

However, I cannot continue to describe the whole develop- 
ment of quantum theory because that would mean writing an 
encyclopaedia of physics of the last thirty-five years. I have 
given this short account of the initial period because it is fashion- 
able to-day to regard physics as the product of pure reason. 
Now I am not so unreasonable as to say that physics could 
proceed by experiment only, without some hard thinking, nor 
do I deny that the forming of new concepts is guided to some 
degree by general philosophical principles. But I know from my 
own experience, and I could call on Heisenberg for confirmation, 
that the laws of quantum mechanics were found by a slow and 
tedious process of interpreting experimental results. I shall try to 
describe the main steps of this process in the shortest possible way. 

Yet it must be remembered that these steps do not form a 
straight staircase upwards, but a tangle of interconnected alleys. 
However, I must begin somewhere. 

There was first the question whether the stationary states are 
certain selected mechanical orbits, and if so, which. Proceeding 
from example to example (oscillator, rotator, hydrogen atom), 
'quantum conditions' were found (Bohr, Wilson, Sommerfeld) 
which for every periodic coordinate q of the motion can be 
expressed in the form 

$$I = \oint p\,dq = hn, \tag{8.27}$$

where p is the momentum corresponding to q and the integration 
extended over a period. The most convincing theoretical 
argument for choosing these integrals $I$ was given by Ehrenfest,
who showed that if the system is subject to a slow external perturbation,
$I$ is an invariant and therefore well suited to be
equated to a discontinuous 'jumping' quantity $hn$.

Among these 'adiabatic invariants' $I$ there is in particular the
angular momentum of a rotating system and its component in a 


given direction; if both are to be integer multiples of h, the strange 
conclusion is obtained that an atom could not exist in all orienta- 
tions but only in a selected finite set. This was confirmed by 
Stern and Gerlach's celebrated experiment (deflecting an 
atomic beam in an inhomogeneous magnetic field). I am proud 
that this work was done in my department in Frankfort-on- 
Main. There is hardly any other effect which demonstrates the 
deviations from classical mechanics in so striking a manner. 

A signpost for further progress was Bohr's correspondence 
principle. It says that, though ordinary mechanics does not 
apply to atomic processes, we must expect that it holds at least 
approximately for large quantum numbers. This was not so 
much philosophy as common sense. Yet in the hands of Bohr 
and his school it yielded a rich harvest of results, beginning with 
the calculation of the constant $R$ in the Balmer formula.† The
mysterious laws of spectroscopy were reduced to a few general
rules about the energy-levels and the transitions between them.
The most important of these rules was Pauli's exclusion prin- 
ciple, derived from a careful discussion of simple spectra; it says 
that two or more electrons are never in the same quantum state, 
described by fixed values of the quantum numbers (8.27) 
belonging to all periods, including the electronic spin (Uhlenbeck 
and Goudsmit). With the help of these simple principles the 
periodic system of the elements could be explained in terms of 
electronic states. But all these great achievements of Bohr's 
theory are outside the scope of our present interest. I have, 
however, to mention Bohr's considerations about the correspon- 
dence between the amplitudes of the harmonic components of a 
mechanical orbit and the intensity of certain spectral lines. 
Consider an atom in the quantum state $n$ with energy $\epsilon_n$ and
suppose the orbit can be, for large n, approximately described 
by giving the coordinates q as functions of time. As these will 
be periodic, one can represent q as a harmonic (Fourier) series, 
of the type

$$q(t) = \sum_m a_m(n)\cos[2\pi\nu(n)\,mt + \delta_m(n)], \tag{8.28}$$

† See Atomic Physics, Ch. V. 1, p. 98; A. XIV, p. 300.


where the fundamental frequency $\nu(n)$ and the amplitudes
$a_m(n)$ depend on the number $n$ of the orbit considered. In reality,
the frequencies observed are not $\nu(n), 2\nu(n), 3\nu(n), \ldots$ but

$$\nu_{nm} = \frac{1}{h}(\epsilon_n - \epsilon_m),$$

and what about the amplitudes? It was clear that the squares
$|a_m(n)|^2$ should correspond in some way to the transition probabilities
$B_{nm} = B_{mn}$ introduced by Einstein in his derivation
of Planck's radiation law (8.16). But how could the mth over- 
tone of the nth orbit be associated with the symmetric relation 
between two states m, n ? 

This was the central problem of quantum physics in the years 
between 1913 and 1925. In particular there arose a great interest 
in measuring intensities of spectral lines, with the help of newly- 
invented recording micro-photometers. Simple laws for the 
intensities of the component lines of multiplets were discovered 
(Ornstein, Moll), and presented in quadratic tables which look 
so much like matrices that it is hard to understand why this 
association of ideas did not happen in some brain. 

It did not happen because the mind of the physicist was still 
working on classical lines, and it needed a special effort to get 
rid of this bias. One had to give up the idea of a coordinate 
being a function of time, represented by a Fourier series like 
(8.28); one had to omit the summation in this formula and to 
take the set of unconnected terms as representative of the 
coordinate. Then it became possible to replace the Fourier 
amplitudes $a_m(n)$ by quantum amplitudes $a(m,n)$ with two
equivalent indices $m$, $n$, and to generalize the multiplication law
for Fourier-coefficients into that for matrices†

$$c(m,n) = \sum_k a(m,k)\,b(k,n). \tag{8.29}$$

Heisenberg justified the rejection of traditional concepts by a 
general methodological principle: a satisfactory theory should 
use no quantities which do not correspond to anything observable.
The classical frequencies $m\nu(n)$ and the whole idea of
orbits have this doubtful character. Therefore one should 

† See Atomic Physics, Ch. V. 3, p. 123; A. XV, p. 305.


eliminate them from the theory and introduce instead the 
quantum frequencies $\nu_{nm} = h^{-1}(\epsilon_n - \epsilon_m)$, while the orbits should
be completely abandoned. 

This suggestion of Heisenberg has been much admired as the 
root of the success of quantum mechanics. Attempts have been 
made to use it as a guide in overcoming the difficulties which 
have meanwhile turned up in physics (in the application of 
quantum methods to field theories and ultimate particles); yet 
with little success. Now quantum mechanics itself is not free 
from unobservable quantities. (The wave-function of Schrödinger,
for instance, is not observable, only the square of its
modulus.) To rid a theory of all traces of such redundant con- 
cepts would lead to unbearable clumsiness. I think, though 
there is much to be said for cleaning a theory in the way recom- 
mended by Heisenberg, the success depends entirely on scientific 
experience, intuition, and tact. 

The essence of the new quantum mechanics is the representa- 
tion of physical quantities by matrices, i.e. by mathematical 
entities which can be added and multiplied according to well- 
known rules just like simple numbers, with the only difference 
that the product is non-commutative. For instance, the quan- 
tum conditions (8.27) can be transcribed as the commutation law 

$$qp - pq = i\hbar, \qquad \hbar = \frac{h}{2\pi}. \tag{8.30}$$


The Hamiltonian form of mechanics can be preserved by re- 
placing all quantities by the corresponding matrices. In par- 
ticular the determination of stationary states can be reduced 
to finding matrices q,p for which the Hamiltonian H(p,q) 
as a matrix has only diagonal elements which are then the energy- 
levels of the states. In order to obtain the connexion with 
Planck's theory of radiation, the squares $|q(m,n)|^2$ have to be
interpreted as Einstein's coefficients $B_{mn}$. In this way a few
simple examples could be satisfactorily treated. But matrix 
mechanics applies obviously only to closed systems with discrete 
energy-levels, not to free particles and collision problems. 
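One of the simple examples, the harmonic oscillator, can be sketched with small truncated matrices. The code below builds q and p in units where ħ, the mass, and the frequency are all 1, using the standard oscillator matrices with the sign convention qp − pq = iħ; these conventions, the matrix size, and the helper names are assumptions of the sketch, not taken from the text. It then forms the commutator and the Hamiltonian ½(p² + q²). Cutting the infinite matrices to a finite size spoils the last diagonal element, which is why that element is treated separately.

```python
import math

def oscillator_matrices(size):
    """Truncated matrices q, p of a harmonic oscillator (hbar = m = omega = 1)."""
    q = [[0j] * size for _ in range(size)]
    p = [[0j] * size for _ in range(size)]
    for n in range(1, size):
        amp = math.sqrt(n / 2.0)
        q[n - 1][n] = q[n][n - 1] = amp   # real, symmetric
        p[n - 1][n] = -1j * amp           # imaginary, Hermitian
        p[n][n - 1] = 1j * amp
    return q, p

def matmul(x, y):
    """Matrix product c(m, n) = sum over k of a(m, k) * b(k, n)."""
    size = len(x)
    return [[sum(x[i][k] * y[k][j] for k in range(size)) for j in range(size)]
            for i in range(size)]

size = 6
q, p = oscillator_matrices(size)
qp, pq = matmul(q, p), matmul(p, q)
commutator = [[qp[i][j] - pq[i][j] for j in range(size)] for i in range(size)]
qq, pp = matmul(q, q), matmul(p, p)
hamiltonian = [[0.5 * (pp[i][j] + qq[i][j]) for j in range(size)]
               for i in range(size)]

# Diagonal of H: the energy-levels n + 1/2 (last entry is a truncation artefact).
print([round(hamiltonian[k][k].real, 6) for k in range(size)])
```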

This restriction was removed by Schrödinger's wave mechanics
which sprang quite independently from an idea of de Broglie


about the application of quantum theory to free particles. It is 
widely held that de Broglie's work is a striking example of the 
power of the human mind to find natural laws by pure reason, 
without recourse to observation. I have not taken part in the 
beginnings of wave mechanics, as I have in matrix mechanics, 
and cannot speak therefore from my own experience. Yet I 
think that not a single step would have been possible if some 
necessary foothold in facts had been missing. To deny this 
would mean to maintain that Planck's discovery of the quantum 
and Einstein's theory of relativity were products of pure think- 
ing. They were interpretations of facts of observation, solutions 
of riddles given by Nature, difficult riddles indeed, which only
great thinkers could solve. 

De Broglie observed that in relativity the energy of a particle 
is not a scalar, but the fourth component of a vector in space- 
time, whose other components represent the momentum p; on 
the other hand, the frequency v of a plane harmonic wave is also 
the fourth component of a space-time vector, whose other 
components represent the wave vector k (having the direction 
of the wave normal and the length $\lambda^{-1}$, where $\lambda$ is the wave-length).
Now if Planck postulates that $\epsilon = h\nu$, one is compelled
to assume also $p = hk$. For light waves where $\lambda\nu = c$, this had
already been done by Einstein, who spoke of photons behaving
like darts with the momentum $p = \epsilon/c = h\nu/c$. De Broglie
applied it to electrons where the relation between $\epsilon$ and $p$ is
more complicated, namely obtained from (8.3) by eliminating
the velocity $v$:

$$\epsilon = c\sqrt{m_0^2c^2 + p^2}. \tag{8.31}$$


If a particle ($\epsilon$, $p$) is always accompanied by a wave ($\nu$, $k$) the
phase velocity of the wave would be (using $\epsilon = mc^2$, $p = mv$)

$$\nu\lambda = \nu/k = \epsilon/p = c^2/v > c, \tag{8.32}$$

apparently an impossible result, as the principle of relativity 
excludes velocities larger than that of light. But de Broglie 
was not deterred by this; he observed that the prohibition of 
velocities larger than c refers only to such motions which can be 
used for sending time-signals. That is impossible by means 


of a monochromatic wave. For a signal one must have a small 
group of waves, the velocity of which can be obtained, according 
to Rayleigh, by differentiating frequency with respect to wave 
number. Thus, from (8.31) and (8.32),†

$$\frac{d\nu}{dk} = \frac{d\epsilon}{dp} = v,$$

a most satisfactory result which completely justifies the formal 
connexion of particles and waves, though the physical meaning 
of this connexion was still mysterious. 
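The two velocities can be compared numerically. The sketch below assumes the usual relativistic relation between energy and momentum, ε = c√(m₀²c² + p²), with rest mass m₀, and works in illustrative units where c = m₀ = 1; since ε = hν and p = hk, Planck's constant drops out of both ν/k and dν/dk. The function names and the sample momentum are assumptions of the sketch.

```python
import math

C = 1.0   # velocity of light, illustrative units
M0 = 1.0  # rest mass, illustrative units

def energy(p):
    """Relativistic energy as a function of momentum (assumed relation)."""
    return C * math.sqrt(M0 ** 2 * C ** 2 + p ** 2)

def phase_velocity(p):
    """nu / k = eps / p, because eps = h * nu and p = h * k."""
    return energy(p) / p

def group_velocity(p, dp=1e-6):
    """d(nu)/dk = d(eps)/dp, estimated by a central difference."""
    return (energy(p + dp) - energy(p - dp)) / (2 * dp)

def particle_velocity(p):
    """v = c**2 * p / eps, from eps = m c**2 and p = m v."""
    return C ** 2 * p / energy(p)

p = 0.75
print(phase_velocity(p), group_velocity(p), particle_velocity(p))
```

The phase velocity exceeds c, the group velocity coincides with the velocity of the particle, and their product is c².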

This reasoning is indeed a stroke of genius, yet not a triumph 
of a priori principles, but of an extraordinary capacity for com- 
bining and unifying remote subjects. 

I should say the same about the work of Schrödinger and Dirac,
but you could better ask them directly what they think about the
roots of their discoveries. I shall not describe them here in detail,
but indicate some threads to other facts or theories. Schrödinger
says that he was stimulated by a remark of de Broglie that any 
periodic motion of an electron must correspond to a whole 
number of waves of the corresponding wave motion. This led 
him to his wave equation whose eigenvalues are the energy- 
levels of stationary states. He was further guided by the analogy 
of mechanics and optics known from Hamilton's investigations; 
the relation of wave mechanics to ordinary mechanics is the 
same as that of undulating optics to geometrical optics. Then, 
looking out for a connexion of wave mechanics with matrix 
mechanics, Schrödinger recognized as the essential feature of a
matrix that it represents a linear operator acting on a vector
(one-column matrix), and came in this way to his operator calculus
(see Appendix, 27); if a coordinate $q$ is taken as an ordinary
variable and the corresponding momentum as the operator

$$p = \frac{\hbar}{i}\frac{\partial}{\partial q},$$

the commutation law (8.30) becomes a trivial identity. Applying
the theory of sets of ortho-normal functions, he could then
establish the exact relation between matrix and wave mechanics. 

† See Atomic Physics, Ch. IV. 5, p. 84; A. XI, p. 295.


It is most remarkable that the whole story has been developed 
by Dirac from Heisenberg's first idea by an independent and 
formally more general method based on the abstract concept of 
non-commuting quantities ($q$-numbers).

The growth of quantum mechanics out of three independent 
roots uniting to a single trunk is strong evidence for the inevita- 
bility of its concepts in view of the experimental situation. 

From the standpoint of these lectures on cause and chance it 
is not the formalism of quantum mechanics but its interpretation 
which is of importance. Yet the formalism came first, and was 
well secured before it became clear what it really meant: nothing 
more or less than a complete turning away from the predomi- 
nance of cause (in the traditional sense, meaning essentially 
determinism) to the predominance of chance. 

This revolution of outlook goes back to a tentative interpreta- 
tion which Einstein gave of the coexistence of light waves and 
photons. He spoke of the waves being a 'ghost field' which has 
no ordinary physical meaning but whose intensity determines 
the probability of the appearance of photons. This idea could be 
transferred to the relation of electrons (and of material particles 
in general) to de Broglie's waves. With the help of Schrödinger's
wave equation, the scattering of particles by obstacles, the excita- 
tion laws of atoms under electron bombardment, and other 
similar phenomena could be calculated with results which con- 
firmed the assumption. 

I shall now describe the present situation of the theory in a 
formulation due to Dirac which is well adapted to comparing the 
new statistical physics with the old deterministic one. 



In quantum mechanics physical quantities or observables are
not represented by ordinary variables, but by symbols which 
have no numerical values but determine the possible values of 
the observable in a definite way to be described presently. These 
symbols can be added and multiplied with the proviso that 
multiplication is non-commutative: AB is in general different 
from BA. I cannot deal with the most general aspect of this 
symbolic calculus, but shall consider a special representation, 
namely that where the coordinates $q_1, q_2, \ldots$ of the particles are
regarded as ordinary numbers. Then a definite state of a system
is defined by a function $\psi(q_1, q_2, \ldots)$, and an observable $A$ can be
represented by a linear operator: $A\psi(q)$ means a new function
$\phi(q)$, the result of operating with $A$ on $\psi$. If this result is, apart
from a factor, identical with $\psi$,

$$A\psi = a\psi, \tag{9.1}$$

$\psi$ is called an eigenfunction of $A$ and the constant $a$ an eigenvalue.
The whole set of eigenvalues is characteristic for the operator 
A and represents the possible numerical values of the observable, 
which may be continuous or discontinuous. 

The coordinates $q$ themselves can be considered to be operators,
namely multiplication operators: $q_\alpha$ operating on $\psi$
means multiplying by $q_\alpha$. Operators whose eigenvalues are
all real numbers are called real (or 'Hermitian') operators. It
is clear that all physical quantities have to be represented by real
operators, as the eigenvalues are supposed to represent the possible
results of measuring a physical quantity. One can easily
see that not only the multiplication operators $q_\alpha$ but also the
momenta $p_\alpha = \frac{\hbar}{i}\frac{\partial}{\partial q_\alpha}$ are real. But for the formal argument one
can also use complex operators, of the form $C = A + iB$ (where
$i = \sqrt{-1}$), and its conjugate $C^* = A - iB$; then $CC^*$ can be
shown to be a real operator with only positive (or zero) eigenvalues.

If two observables are represented by non-commuting operators,
$A$ and $B$, their eigenfunctions are not all identical; if $a$ is
an eigenvalue of $A$ belonging to such an eigenfunction, there is
no state of the system for which a measurement can result in
finding simultaneously for $A$ and $B$ sharp numerical values $a$
and $b$.

The theory cannot therefore in general predict definite 
values of all physical properties, but only probability laws. 
The same experiment, repeated under identical and controllable 
conditions, may result in finding for a quantity A so many 
times $a_1$, so many times $a_2$, etc., and for $B$ in the same way $b_1$
or $b_2$, etc. But the average of repeated measurements must be
predictable. Whatever the rule for constructing the number
which represents the average $\bar{A}$ of the measurements of $A$, it
must, by common sense, have the properties that $\overline{A+B} = \bar{A} + \bar{B}$
and $\overline{cA} = c\bar{A}$, if $c$ is any number.

From this alone there follows an important result. Consider
apart from the averages $\bar{A}$, $\bar{B}$ of two operators $A$, $B$ also their
mean square deviations, or the 'spreading' of the measurements,

$$\delta A = \sqrt{\overline{(A-\bar{A})^2}}, \qquad \delta B = \sqrt{\overline{(B-\bar{B})^2}}, \tag{9.2}$$

then by a simple algebraic reasoning (see Appendix, 28), which
uses nothing other than the fact stated above that $CC^*$ has no
negative eigenvalues, hence $\overline{CC^*} \geq 0$, it is found that

$$\delta A \cdot \delta B \geq \frac{\hbar}{2}\,\bigl|\overline{[A,B]}\bigr|, \tag{9.3}$$

where

$$[A,B] = \frac{1}{i\hbar}(AB-BA) \tag{9.4}$$

is the so-called 'commutator' of the two operators $A$, $B$. If this
is specially applied to a coordinate and its momentum, $A = p$,
$B = q$, one has $[q,p] = 1$, therefore

$$\delta p \cdot \delta q \geq \frac{\hbar}{2}. \tag{9.5}$$

This is Heisenberg's celebrated uncertainty principle which is 


a quantitative expression for the effect of non-commutation on 
measurements, but independent of the exact definition of averages.
It shows how a narrowing of the range for the measured
$q$-values widens the range for $p$. The same holds, according to
(9.3), for any two non-commuting observables with the difference
that the 'uncertainty' depends on the mean of the commutator.
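The reciprocal behaviour of the two spreads can be illustrated with a Gaussian wave function, a standard example that is not taken from the text. The sketch below assumes units with ħ = 1, the momentum operator (ħ/i) d/dq, and averages formed with the weight |ψ|²; the grid, the widths, and the function name are illustrative choices. The wave function is real and even, so the means of q and p vanish and the spreads reduce to root mean squares.

```python
import math

HBAR = 1.0  # illustrative units

def spreads(sigma, half_width=12.0, points=4001):
    """Spreads (delta q, delta p) for psi(q) proportional to exp(-q^2 / (4 sigma^2)).

    The integrals are replaced by sums on a uniform grid reaching
    half_width * sigma on either side of the origin.
    """
    step = 2 * half_width * sigma / (points - 1)
    qs = [-half_width * sigma + i * step for i in range(points)]
    psi = [math.exp(-q * q / (4 * sigma ** 2)) for q in qs]
    norm = sum(v * v for v in psi) * step
    mean_q2 = sum(q * q * v * v for q, v in zip(qs, psi)) * step / norm
    # Mean of p^2 equals hbar^2 times the integral of (d psi / dq)^2.
    dpsi = [(psi[i + 1] - psi[i - 1]) / (2 * step) for i in range(1, points - 1)]
    mean_p2 = HBAR ** 2 * sum(d * d for d in dpsi) * step / norm
    return math.sqrt(mean_q2), math.sqrt(mean_p2)

for sigma in (0.5, 2.0):
    dq, dp = spreads(sigma)
    print(sigma, dq, dp, dq * dp)
```

Narrowing the packet in q by a factor of four widens it in p by the same factor, while the product stays at the lower limit ħ/2, which this particular wave function happens to attain.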

These general considerations are, so to speak, the kinematical 
part of quantum mechanics. Now we turn to the dynamical
part.

Just as in classical mechanics, the dynamical behaviour of a 
system of particles is described by a Hamiltonian 

which is a (differential) operator. It is usually just taken over 
from classical mechanics (where, if necessary, products like pq 
have to be 'symmetrized' into $\tfrac{1}{2}(pq+qp)$). In Dirac's relativistic
theory of the electron there are, apart from the space coordinates, 
observables representing the spin (and similar quantities in 
meson theory); they lead to no fundamental difficulty and will 
not be considered here. 

Yet one remark about the Hamiltonian H has to be made, 
bearing on our general theme of cause and chance: H contains 
in the potential energy (and in corresponding electromagnetic 
interaction terms) the last vestiges of Newton's conception of 
force, or, using the traditional expression, of causation. We 
have to remember this point later. 

In classical mechanics we have used a formulation of the laws 
of motion which applies just as well to a simple system, where all 
details of the motion are of interest, as to a system of numerous 
particles, where only statistical results are desired (and possible). 
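A function $f(t,p,q)$ of time and of all coordinates and momenta
was considered; if $p$, $q$ change with time according to the equations
of motion, the total change of $f$ is given by

$$\frac{df}{dt} = \frac{\partial f}{\partial t} - [H,f], \tag{9.6}$$

where $[H,f]$ is the Poisson bracket

$$[H,f] = \sum_k\left(\frac{\partial H}{\partial q_k}\frac{\partial f}{\partial p_k} - \frac{\partial H}{\partial p_k}\frac{\partial f}{\partial q_k}\right). \tag{9.7}$$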


One recovers the canonical equations by taking for $f$
simply $q_k$ or $p_k$ respectively. On the other hand, if one puts
$df/dt = 0$, any solution of this equation is an integral of the
equations of motion, and from a sufficient number of such
integrals $f_k(t,p,q) = c_k$ one can obtain the complete solution
giving all $p$, $q$ as functions of $t$.

But if this is not required, the same equation is also the means 
for obtaining statistical information in terms of a solution $f$,
called the 'distribution function', as I have described in detail.
$f$ is that integral of

$$\frac{\partial f}{\partial t} = [H,f], \tag{9.8}$$

which for $t = 0$ goes over into a given initial distribution $f_0(p,q)$.
If, in particular, this latter function vanishes except in the
neighbourhood of a given point $p_0$, $q_0$ in phase space, or, in Dirac's
notation, if $f_0 = \delta(p-p_0)\,\delta(q-q_0)$, one falls back to the case of
complete knowledge, $q_0$ and $p_0$ being the initial values of $q$ and $p$.
This procedure cannot be transferred without alteration to 
quantum mechanics for the simple reason that p and q cannot 
be simultaneously given fixed values. The uncertainty relation 
(9.5) forbids the prescribing of sharp initial values for all p and q. 
Hence the first part of the programme, namely a complete 
knowledge of the motion in the same sense as in classical mechan- 
ics, breaks down right from the beginning. Yet the second part, 
statistical prediction, remains possible. Following Dirac, we 
ask which quantities have to replace the Poisson brackets (9.7) 
in quantum theory, where all quantities are in general non-commuting.
These brackets [α, β] have a number of algebraic
properties, the most important of them being

$$[\alpha, \beta_1 + \beta_2] = [\alpha, \beta_1] + [\alpha, \beta_2], \qquad [\alpha, \beta_1\beta_2] = \beta_1[\alpha, \beta_2] + [\alpha, \beta_1]\beta_2. \tag{9.9}$$

If one postulates that these shall hold also for non-commuting
quantities α and β, provided the order of factors is always
preserved (as it is in (9.9)), then it can be shown (Appendix, 29)
that [α, β] is exactly the commutator as defined by (9.4).
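
That the commutator obeys the sum rule and the ordered product rule can be verified on small matrices. The sketch below assumes 2×2 integer matrices chosen arbitrarily for illustration and uses the plain difference αβ − βα; a constant prefactor in the definition of the bracket would not affect either identity. The helper names are hypothetical.

```python
def matmul(a, b):
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def add(a, b):
    return [[x + y for x, y in zip(ra, rb)] for ra, rb in zip(a, b)]


def sub(a, b):
    return [[x - y for x, y in zip(ra, rb)] for ra, rb in zip(a, b)]


def commutator(a, b):
    """ab - ba; a constant prefactor, if any, is left out."""
    return sub(matmul(a, b), matmul(b, a))


# Arbitrary non-commuting test matrices (assumed values).
alpha = [[0, 1], [2, 3]]
beta1 = [[1, 2], [0, 1]]
beta2 = [[2, 0], [1, 1]]

# Sum rule: [a, b1 + b2] = [a, b1] + [a, b2].
sum_rule_lhs = commutator(alpha, add(beta1, beta2))
sum_rule_rhs = add(commutator(alpha, beta1), commutator(alpha, beta2))

# Product rule with the order of factors preserved:
# [a, b1 b2] = b1 [a, b2] + [a, b1] b2.
product_rule_lhs = commutator(alpha, matmul(beta1, beta2))
product_rule_rhs = add(matmul(beta1, commutator(alpha, beta2)),
                       matmul(commutator(alpha, beta1), beta2))

# Reversing the order of the factors gives a different result.
reversed_order = commutator(alpha, matmul(beta2, beta1))
```

Both identities hold exactly here because the entries are integers; the last line shows why the order of factors has to be kept.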

Now one has to replace the function f in (9.8) by a time-dependent
operator ρ, called the statistical operator, and to
determine ρ from the equation (formally identical with (9.8)):

$$\frac{\partial \rho}{\partial t} = [H, \rho] \tag{9.10}$$

with suitable initial conditions. To express these in a simple
way it is convenient to represent all operators by matrices in
the q-space; A operating on a function ψ(q) is defined by

$$A\psi(q) = \int A(q,q')\,\psi(q')\,dq' \tag{9.11}$$

(q stands for all coordinates $q_1, q_2, \ldots$, and q' for another set of
values $q'_1, q'_2, \ldots$), where A(q,q') is called the matrix representing A.

The product AB is represented by the matrix

$$AB(q,q') = \int A(q,q'')\,B(q'',q')\,dq''. \tag{9.12}$$

If now ρ and H are taken as such matrices, where the elements
of ρ depend also on time, (9.10) is a differential equation for
ρ(t,q,q'), and the initial conditions are simply

$$\rho(0,q,q') = \rho_0(q,q'), \tag{9.13}$$

where $\rho_0$ is a given function of the two sets of variables.

The number of vector arguments in ρ for a system of N particles
is 2N, exactly as in the case of classical theory in the function
f(p,q). But while the meaning of f depending on p,q is
obvious, that of ρ depending on two sets q, q' is not, except in
one case, namely when the two sets are identical, q = q'; then
the function

$$\rho(t,q,q) = n(t,q) \tag{9.14}$$

is the number density, corresponding to the classical

$$\int f(t,q,p)\,dp = n(t,q).$$

Quite generally, the classical operation of integrating over the p's
is replaced by the simpler operation of equating the two sets
of q's, q = q', or in matrix language, taking the diagonal elements
of ρ.



The average of an observable A for a configuration q must be
a real number $\bar{A}$ formed from ρ and A so that $n\bar{A}$ is linear in both
operators. The simplest expression of this kind is

$$n\bar{A} = \tfrac{1}{2}(\rho A + A\rho)_{q=q'}, \tag{9.15}$$

and this gives, in fact, all results of quantum mechanics usually
obtained with the help of the wave function. For instance, the
statistical matrix describing a stationary state where A has a
sharp value a, belonging to the eigenfunction ψ(a,q), is

$$\rho = \psi(a,q)\,\psi^*(a,q'). \tag{9.16}$$

Then, from the definition (9.12) it follows easily that for this ρ
and any real operator A one has

$$A\rho = \rho A = a\rho, \tag{9.17}$$

hence for q = q', with (9.14),

$$n(a,q) = |\psi(a,q)|^2, \qquad \bar{A} = a. \tag{9.18}$$

Thus we have obtained the usual assumption that $|\psi(a,q)|^2$
is the 'probability' (if normalized to 1) or 'number density'
(if normalized to N) at the point q for the state a. (It must,
however, be noted that for systems of numerous particles, like
liquids in motion, other ways of averaging are useful, for instance
for the square of a momentum, instead of $n\overline{p^2} = \tfrac{1}{2}(\rho p^2 + p^2\rho)_{q=q'}$,
the expression $\tfrac{1}{4}(\rho p^2 + p^2\rho + 2p\rho p)_{q=q'}$, which, however, for uniform
conditions coincides with the former.)
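
The relations between a pure-state statistical matrix, the number density and the average can be illustrated on a toy model. The sketch below assumes a two-valued index q in place of the continuous coordinate, so that integrals become sums, and a hypothetical Hermitian 2×2 matrix standing for the observable A; none of these numbers come from the text.

```python
import math


def matmul(a, b):
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


# Hermitian matrix standing in for the observable A (an assumed toy example).
A = [[2, -1j], [1j, 2]]
a = 3.0                                   # one of its eigenvalues
s = 1 / math.sqrt(2)
psi = [complex(s, 0), complex(0, s)]      # normalized eigenvector for a = 3

# Pure-state statistical matrix, rho(q, q') = psi(q) * conj(psi(q')).
rho = [[psi[q] * psi[qp].conjugate() for qp in range(2)] for q in range(2)]

A_rho = matmul(A, rho)
rho_A = matmul(rho, A)

# Number density: the diagonal elements of rho.
n = [rho[q][q].real for q in range(2)]

# Local average from n * mean = (1/2) (rho A + A rho) taken at q = q'.
mean_A = [0.5 * (rho_A[q][q] + A_rho[q][q]).real / n[q] for q in range(2)]
```

In this example Aρ and ρA both equal aρ, the diagonal of ρ reproduces |ψ|², and the local average equals the sharp value a at every q.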

Let us consider the general stationary case where ρ is independent
of time and therefore satisfies

$$[H, \rho] = 0. \tag{9.19}$$

Any solution of this equation, i.e. any quantity Λ which commutes
with H, is called an integral of the motion, in analogy to
the corresponding classical conception. H itself is, of course, an
integral. All integrals $\Lambda_1, \Lambda_2, \ldots$ have different eigenvalues
$\lambda_1, \lambda_2, \ldots$ for one and the same eigenfunction $\psi(\lambda_1, \lambda_2, \ldots; q_1, q_2, \ldots)$
or shortly ψ(λ, q):

$$\Lambda_1\psi = \lambda_1\psi, \qquad \Lambda_2\psi = \lambda_2\psi, \quad \ldots. \tag{9.20}$$

ρ can be taken as any function of the Λ's; its matrix representation
is given by

$$\rho(q,q') = \sum_\lambda P(\lambda)\,\psi(\lambda,q)\,\psi^*(\lambda,q'), \tag{9.21}$$

from which one obtains, with (9.18),

$$n(q) = \rho(q,q) = \sum_\lambda P(\lambda)\,|\psi(\lambda,q)|^2 = \sum_\lambda P(\lambda)\,n(\lambda,q). \tag{9.22}$$

This shows that the arbitrary coefficient P(λ) is the probability
of finding the system in the stationary state λ.

Dynamical problems arise in a somewhat different way from 
those in classical theory. There it has a definite meaning to 
speak about the motion of particles in a closed system, for 
instance of the orbit of Jupiter in the planetary system. In 
quantum theory a closed system settles down in a definite 
stationary state, or a mixture of such states as given by (9.21). 
But then nothing is changing in time; one cannot even make an 
observation without interfering with the state of the system. 
In classical physics it is supposed that we have to do with an 
objective and always observable situation; the process of measur- 
ing is assumed to have no influence on the object of observation. 
I have, however, drawn your attention to the point that even 
in classical physics this postulate is practically never fulfilled 
because of the Brownian motion which affects the instruments. 
We are therefore quite prepared to find that the assumption of 
'harmless' observations is impossible. 

The most general way of formulating a dynamical problem is 
to split the Hamiltonian in two parts 

$$H = H_0 + V, \tag{9.23}$$

where $H_0$ describes what is of interest while V is of minor importance,
a so-called perturbation. V may also include external
influences and depend explicitly on the time. This partition is,
of course, arbitrary to a high degree; but it corresponds to the
actual situation. If a water molecule H₂O is assembled from its
atoms, one can either ask what the stationary states of the whole
system are, or one can consider the parts H₂ and O and ask how
the states of the hydrogen molecule H₂ are changed by the
approaching oxygen atom, or one can ask the same question
for the HO radical and the H atom. The latter two are dynamical
problems.

Dynamical problems in quantum theory therefore, in contrast
to those in classical theory, cannot be defined without a subjective,


more or less arbitrary decision about what you are interested 
in. In other words, quantum mechanics does not describe an 
objective state in an independent external world, but the aspect 
of this world gained by considering it from a certain subjective 
standpoint, or with certain experimental means and arrange- 
ments. This statement has produced much controversy, and 
though it is generally accepted by the present generation of 
physicists it has been decidedly rejected by just those two men 
who have done more for the creation of quantum physics than 
anybody else, Planck and Einstein. Yet, with all respect, I 
cannot agree with them. In fact, the assumption of absolute 
observability which is the root of the classical concepts seems to 
me only to exist in imagination, as a postulate which cannot be 
satisfied in reality. 

Assuming the partition (9.23) one has to describe the system in
terms of the integrals of motion $\Lambda_1, \Lambda_2, \ldots$ of $H_0$ which are, however,
not integrals of motion of H. All operators are then to be
expressed as matrices in the eigenvalues $\lambda = (\lambda_1, \lambda_2, \ldots)$ of $\Lambda_1, \Lambda_2, \ldots$;
for instance, the statistical operator ρ by the matrix ρ(t; λ, λ').
The diagonal elements of this matrix

$$P(t; \lambda) = \rho(t; \lambda, \lambda) \tag{9.24}$$

represent the probability of a state λ at time t, and they go over
for t = 0 into the coefficients P(λ) which appear in the expansion
(9.21) and represent the initial probabilities. The function
ρ(t; λ, λ') can be determined from the differential equation (9.10)
by a method of successive approximations. My collaborator,
Green, has even found an elegant formula representing the complete
solution. To a second approximation one finds

$$P(t,\lambda) = P(\lambda) + \sum_{\lambda'} J(\lambda,\lambda')\{P(\lambda') - P(\lambda)\} + \ldots; \tag{9.25}$$

the coefficients are given by

$$J(\lambda,\lambda') = \frac{1}{\hbar^2}\left| \int_0^t V(t;\lambda,\lambda')\, e^{(i/\hbar)(E-E')t}\, dt \right|^2, \tag{9.26}$$

where E is the energy of the unperturbed system in the state λ,
E' that in the state λ' (see Appendix, 30).


Now equation (9.25) has precisely the form of the laws of radioactive
decay, or of a set of competing mono-molecular reactions.
The matrix J(λ,λ') obviously represents the probability of a
transition or jump from the state λ to the state λ'. This interpretation
becomes still more evident if one assumes that the
λ-values are practically continuous, as would be the case if the
system allowed particles to fly freely about (for instance in radioactivity
one has to take account of the emitted α-particles; in the
theory of optical properties of an atom of the photons emitted
and absorbed). If external influences are excluded, so that V
does not depend on time, the integral (9.26) can be worked out
with the result that J becomes proportional to the time

$$J(\lambda,\lambda') = j(\lambda,\lambda')\,t, \tag{9.27}$$

where

$$j(\lambda,\lambda') = \frac{2\pi}{\hbar}\,|V(\lambda,\lambda')|^2\,\delta(E-E'). \tag{9.28}$$
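
How the proportionality to the time comes about can be sketched in a few lines. The outline assumes that J has the standard second-order form, the squared modulus of the time integral of V multiplied by a phase factor and divided by ħ², that V is constant in time, and it writes ω = (E − E')/ħ, a symbol introduced here only for brevity. It is heuristic: the δ-function is meaningful only under an integral over practically continuous λ-values.

```latex
\begin{aligned}
J(\lambda,\lambda')
  &= \frac{|V(\lambda,\lambda')|^2}{\hbar^2}
     \left|\int_0^t e^{i\omega\tau}\,d\tau\right|^2
   = \frac{|V(\lambda,\lambda')|^2}{\hbar^2}\,
     \frac{4\sin^2(\omega t/2)}{\omega^2}, \\
\frac{4\sin^2(\omega t/2)}{\omega^2}
  &\;\longrightarrow\; 2\pi t\,\delta(\omega)
   = 2\pi\hbar\,t\,\delta(E-E') \qquad (t \text{ large}), \\
J(\lambda,\lambda')
  &\approx \frac{2\pi}{\hbar}\,|V(\lambda,\lambda')|^2\,\delta(E-E')\;t .
\end{aligned}
```

The second line uses the fact that the function of ω on the left has total area 2πt and becomes ever more sharply peaked at ω = 0 as t grows.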


The last factor δ(E − E') says that j(λ, λ') differs from zero for
two states λ and λ' only if their energy is equal. j(λ, λ') is
obviously the transition probability per unit time, precisely 
the quantity used in radioactivity. 
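
The analogy with decay and reaction laws can be made concrete with a small numerical sketch. It assumes that the second-order formula may be applied repeatedly over short intervals dt with J = j·dt, which turns it into a rate equation, and it uses hypothetical rates for three states; the rates are taken symmetric, as they must be when j is proportional to the squared modulus of a Hermitian matrix element.

```python
def step(P, j, dt):
    """One short-time update: P(a) + dt * sum_b j[a][b] * (P(b) - P(a))."""
    n = len(P)
    return [P[a] + dt * sum(j[a][b] * (P[b] - P[a]) for b in range(n))
            for a in range(n)]


# Hypothetical symmetric transition rates per unit time between three states.
j = [[0.0, 0.5, 0.0],
     [0.5, 0.0, 0.2],
     [0.0, 0.2, 0.0]]

P = [1.0, 0.0, 0.0]          # initially the system is certainly in state 0
dt = 0.01
for _ in range(20000):       # total time 200 in the same arbitrary units
    P = step(P, j, dt)
```

The total probability is conserved at every step, and the probabilities of all states connected by non-vanishing rates drift towards a common value.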

By applying the formula (9.25) to the case of the interaction 
of an atom with an electromagnetic field one obtains the formula 
(8.24) which was used by Einstein in his derivation of Planck's 
radiation law. There are innumerable similar applications, such 
as the calculation of the effective cross-sections of various kinds 
of collision processes, which have provided ample confirmation 
of the formula (9.25). 


There is no doubt that the formalism of quantum mechanics 
and its statistical interpretation are extremely successful in 
ordering and predicting physical experiences. But can our
desire of understanding, our wish to explain things, be satisfied
by a theory which is frankly and shamelessly statistical and
indeterministic? Can we be content with accepting chance, not
cause, as the supreme law of the physical world?

To this last question I answer that not causality, properly 
understood, is eliminated, but only a traditional interpretation 


of it, consisting in its identification with determinism. I have 
taken pains to show that these two concepts are not identical. 
Causality in my definition is the postulate that one physical 
situation depends on the other, and causal research means the 
discovery of such dependence. This is still true in quantum 
physics, though the objects of observation for which a depen- 
dence is claimed are different: they are the probabilities of 
elementary events, not those single events themselves. 

In fact, the statistical matrix ρ, from which these probabilities
are derived, satisfies a differential equation which is essentially
of the same type as the classical field equations for elastic or
electromagnetic waves. For instance, if one multiplies the
eigenfunction ψ(q) of the Hamiltonian H, Hψ = Eψ, by $e^{-iEt/\hbar}$,
the new function satisfies

$$-\frac{\hbar}{i}\frac{\partial\psi}{\partial t} = H\psi. \tag{9.29}$$

For a free particle, where $H = \frac{1}{2m}(p_x^2 + p_y^2 + p_z^2) = -\frac{\hbar^2}{2m}\Delta$, (9.29)
goes over into the wave equation

$$\frac{2m}{i\hbar}\frac{\partial\psi}{\partial t} = \Delta\psi. \tag{9.30}$$

Although here only the first derivative with respect to time
appears, it does not differ essentially from the ordinary wave
equation, where the left-hand side is $\frac{1}{c^2}\frac{\partial^2\psi}{\partial t^2}$. One must remember
that only $\psi\psi^* = |\psi|^2$ has a physical meaning (as a probability),
where ψ* satisfies the conjugate complex equation

$$-\frac{2m}{i\hbar}\frac{\partial\psi^*}{\partial t} = \Delta\psi^*.$$

For this pair of equations a change in the time direction (t → −t)
can be compensated by exchanging ψ and ψ*, which has no influence
on ψψ*.

The same holds in the general case (9.29), and we see that the 
differential equations of the wave function share the property
of all classical field equations that the principle of antecedence 


is violated: there is no distinction between past and future for 
the spreading of the probability density. On the other hand, the 
principle of contiguity is obviously satisfied. 
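
The reversibility argument can be written out in a few lines. The sketch assumes the time-dependent equation in the form iħ ∂ψ/∂t = Hψ with a real Hamiltonian H (for instance kinetic plus potential energy without a magnetic field); the auxiliary function χ is a name introduced here.

```latex
\begin{aligned}
&i\hbar\,\frac{\partial\psi(q,t)}{\partial t} = H\psi(q,t)
 \quad\Longrightarrow\quad
 -i\hbar\,\frac{\partial\psi^*(q,t)}{\partial t} = H\psi^*(q,t)
 \qquad (H \text{ real}), \\
&\chi(q,t) := \psi^*(q,-t)
 \quad\Longrightarrow\quad
 i\hbar\,\frac{\partial\chi(q,t)}{\partial t}
   = -i\hbar\,\frac{\partial\psi^*}{\partial t}\bigg|_{(q,-t)}
   = H\chi(q,t), \\
&|\chi(q,t)|^2 = |\psi(q,-t)|^2 .
\end{aligned}
```

Thus to every solution there belongs another solution whose probability density runs through the same values in the reversed order of time.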

The differential equation itself is constructed in a way very 
similar to the classical equations of motion. It contains in the 
potential energy, which is part of the Hamiltonian, the classical 
idea of force, or in other words, the Newtonian quantitative 
expression for causation. If, for instance, particles are acting 
on one another with a Coulomb force (as the nucleus and the 
electrons in an atom), there appears in H the same timeless action 
over finite distance as in Newtonian mechanics. Yet one has the 
feeling that these vestiges of classical causality are provisional 
and will be replaced in a future theory by something more satis- 
factory; in fact, the difficulties which the application of quantum 
mechanics to elementary particles encounters are connected with 
the interaction terms in the Hamiltonian; they are obviously still 
too 'classical'. But these questions are outside the scope of my
lectures.

We have the paradoxical situation that observable events obey 
laws of chance, but that the probability for these events itself 
spreads according to laws which are in all essential features 
causal laws. 

Here the question of reality cannot be avoided. What really 
are those particles which, as it is often said, can just as well 
appear as waves? It would lead me far from my subject to discuss
this very difficult problem. I think that the concept of reality is 
too much connected with emotions to allow a generally accept- 
able definition. For most people the real things are those things 
which are important for them. The reality of an artist or a poet 
is not comparable with that of a saint or prophet, nor with that 
of a business man or administrator, nor with that of the natural 
philosopher or scientist. So let me cling to the latter kind of 
special reality, which can be described in fairly precise terms. 
It presupposes that our sense impressions are not a permanent 
hallucination, but the indications of, or signals from, an external 
world which exists independently of us. Although these signals 
change and move in a most bewildering way, we are aware of


objects with invariant properties. The set of these invariants of 
our sense impressions is the physical reality which our mind 
constructs in a perfectly unconscious way. This chair here looks 
different with each movement of my head, each twinkle of my 
eye, yet I perceive it as the same chair. Science is nothing else 
than the endeavour to construct these invariants where they are 
not obvious. If you are not a trained scientist and look through 
a microscope you see nothing other than specks of light and 
colour, not objects; you have to apply the technique of biological 
science, consisting in altering conditions, observing correlations, 
etc., to learn that what you see is a tissue with cancer cells, or 
something like that. The words denoting things are applied to 
permanent features of observation or observational invariants. 

In physics this method has been made precise by using mathe- 
matics. There the invariant against transformation is an exact 
notion. Felix Klein in his celebrated Erlanger Programm has
classified the whole of mathematics according to this idea, and 
the same could be done for physics. 

From this standpoint I maintain that the particles are real, 
as they represent invariants of observation. We believe in the 
'existence' of the electron because it has a definite charge e and a 
definite mass m and a definite spin s; that means in whatever 
circumstances and experimental conditions you observe an effect 
which theory ascribes to the presence of electrons you find for 
these quantities, e, m, s, the same numerical values. 

Whether you can now, on account of these results, imagine 
the electron like a tiny grain of sand, having a definite position in 
space, that is another matter. In fact you can, even in quantum 
theory. What you cannot do is to suppose it also to have a 
definite velocity at the same time; that is impossible according to 
the uncertainty relation. Though in our everyday experience 
we can ascribe to ordinary bodies definite positions and velocities, 
there is no reason to assume the same for dimensions which are 
below the limits of everyday experience. 

Position and velocity are not invariants of observation. But 
they are attributes of the idea of a particle, and we must use 
them as soon as we have made up our minds to describe certain 


phenomena in terms of particles. Bohr has stressed the point that 
our language is adapted to our intuitional concepts. We cannot 
avoid using these even where they fail to have all the properties 
of ordinary experience. Though an electron does not behave like 
a grain of sand in every respect, it has enough invariant pro- 
perties to be regarded as just as real. 

The fact expressed by the uncertainty relation was first dis- 
covered by interpreting the formalism of the theory. An 
explanation appealing to intuition was given afterwards, namely 
that the laws of nature themselves prohibit the measurement 
with infinite accuracy because of the atomic structure of matter: 
the most delicate instruments of observation are atoms or 
photons or electrons, hence of the same order of magnitude as 
the objects observed. Niels Bohr has applied this idea with great 
success to illustrate the restrictions on simultaneous measure- 
ments of quantities subject to an uncertainty rule, which he calls 
'complementary' quantities. 

One can describe one and the same experimental situation 
about particles either in terms of accurate positions or in terms 
of accurate momenta, but not both at the same time. The two 
descriptions are complementary for a complete intuitive under- 
standing. You find these things explained in many text-books 
so that I need not dwell upon them. 

The adjective complementary is sometimes also applied to
the particle aspect and the wave aspect of phenomena, I think
quite wrongly. One can call these 'dual aspects' and speak of a
'duality' of description, but there is nothing complementary as
both pictures are necessary for every real quantum phenomenon.
Only in limiting cases is an interpretation using particles alone 
or waves alone possible. The particle case is that of classical 
mechanics and is applicable only to the case of large masses, 
e.g. to the centre of mass of an almost closed system. The wave 
case is that of very large numbers of independent particles, as 
illustrated by ordinary optics. 

The question of whether the waves are something 'real' or a 
fiction to describe and predict phenomena in a convenient way 
is a matter of taste. I personally like to regard a probability 



wave, even in 3N-dimensional space, as a real thing, certainly
as more than a tool for mathematical calculations. For it has 
the character of an invariant of observation; that means it 
predicts the results of counting experiments, and we expect to 
find the same average numbers, the same mean deviations, etc., 
if we actually perform the experiment many times under the 
same experimental condition. Quite generally, how could we 
rely on probability predictions if by this notion we do not refer 
to something real and objective? This consideration applies just
as much to the classical distribution function f(t; p,q) as to the
quantum-mechanical density matrix ρ(t; q,q').

The difference between f and ρ lies only in the law of propagation,
a difference which can be described as analogous to that
between geometrical optics and undulatory optics. In the latter 
case there is the possibility of interference. The eigenfunctions 
of quantum mechanics can be superposed like light waves and 
produce what is often called 'interference of probability'. 

FIG. 2.

This leads sometimes to puzzling situations if one tries to ex- 
press the observations only in terms of particles. Simple optical 
experiments can be used as examples. Assume a source A of 
light illuminating a screen B with two slits $B_1$, $B_2$ and the
light penetrating these observed on a parallel screen C. If only 
one of the slits B is open, one sees a diffraction pattern around 
the point where the straight line AB hits the screen, with a 
bright central maximum surrounded by small fringes. When 
both slits are open and the central maxima of the diffraction 
pattern overlap, there appear in this region new interference 
fringes, depending on the distance of the two slits. 
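
The appearance of new fringes when both slits are open can be sketched numerically. The model below is a deliberately crude assumption: each slit is treated as a point source of a scalar spherical wave, so the single-slit diffraction envelope is ignored, and the slit separation, wavelength and screen distance are hypothetical values. It compares the intensity with both slits open against the sum of the two single-slit intensities.

```python
import cmath
import math


def amplitude(x, slit_y, wavelength, distance):
    """Spherical scalar wave from a point-like slit at height slit_y."""
    r = math.hypot(distance, x - slit_y)
    return cmath.exp(2j * math.pi * r / wavelength) / r


def intensities(x, slit_separation=1e-4, wavelength=5e-7, distance=1.0):
    """Return (both slits open, sum of the two single-slit intensities) at x."""
    a1 = amplitude(x, +slit_separation / 2, wavelength, distance)
    a2 = amplitude(x, -slit_separation / 2, wavelength, distance)
    both_open = abs(a1 + a2) ** 2
    one_at_a_time = abs(a1) ** 2 + abs(a2) ** 2
    return both_open, one_at_a_time


centre = intensities(0.0)
# First dark fringe, where the two paths differ by about half a wavelength.
dark = intensities(5e-7 * 1.0 / (2 * 1e-4))
```

At the centre the amplitudes add in phase and the intensity is twice the sum of the separate intensities; at the first dark fringe it almost vanishes, although each slit alone would send light there.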


The intensity, i.e. the probability of finding photons on the 
screen, in the case of both slits open, is therefore not a simple 
superposition of those obtained when only one of the slits is 
open. This is at once understandable if you use the picture of 
probability waves determining the appearance of photons. For 
the spreading of the waves depends on the whole arrangement, 
and there is no miracle in the effect of shutting one slit. Yet if 
you try to use the particles alone you get into trouble; for then a 
particle must have passed one slit or the other and it is perfectly 
mysterious how a slit at a finite distance can have an influence on 
the diffraction pattern. Reichenbach, who has published a very 
thorough book on the philosophical foundations of quantum 
mechanics, speaks in such cases of 'causal anomalies'. To avoid 
the perplexity produced by them he distinguishes between 
phenomena, i.e. things really observable, such as the appearance 
of the photons on the screen, and 'inter-phenomena', i.e. theoretical
constructions about what has happened to a photon on its
way, whether it has passed through one slit or the other. He 
states rightly that the difficulties arise only from discussing 
inter-phenomena. 'That a photon has passed through the slit 
$B_1$ is meaningless as a statement of a physical fact.' If we want
to make it a physical fact we have to change the arrangement
in such a way that the passing of a photon through the slit $B_1$ can
be really registered; but then it would not fly on undisturbed, 
and the phenomenon on the screen would be changed. Reichen- 
bach's whole book is devoted to the discussion of this type of 
difficulty. I agree with many of his discussions, though I object 
to others. For instance, he treats the interference phenomenon 
of two slits also in what he calls the wave interpretation; but here 
he seems to me to have misunderstood the optical question. In 
order to formulate the permitted and forbidden (or meaningless) 
statements he suggests the use of a three-valued logic, where the 
law of the 'excluded middle' (tertium non datur) does not hold. 
I have the feeling that this goes too far. The problem is not one 
of logic or logistic but of common sense. For the mathematical 
theory, which is perfectly capable of accounting for the actual 
observations, makes use only of ordinary two-valued logics. 


Difficulties arise solely if one transcends actual observations and 
insists on using a special restricted range of intuitive images 
and corresponding terms. Most physicists prefer to adapt their 
imagination to the observations. Concerning the logical pro- 
blem itself, I had the impression when reading Reichenbach's 
book that in explaining three-valued logic he constantly used
ordinary logic. This may be avoidable or justifiable. I remember 
the days when I was in daily contact with Hilbert, who was 
working on the logical foundations of mathematics. He dis- 
tinguished two stages of logics: intuitive logic dealing with finite 
sets of statements, and formal logic (logistics), which he described 
as a game with meaningless symbols invented to deal with the 
infinite sets of mathematics avoiding contradictions (like that 
revealed in Russell's paradox). But Gödel showed that these
contradictions crop up again, and Hilbert's attempt is to-day 
generally considered a failure. I presume that three-valued 
logic is another example of such a game with symbols. It is 
certainly entertaining, but I doubt that natural philosophy will 
gain much by playing it. 

Thinking in terms of quantum theory needs some effort and 
considerable practice. The clue is the point which I have 
stressed above, that quantum mechanics does not describe a 
situation in an objective external world, but a definite experimen- 
tal arrangement for observing a section of the external world. 
Without this idea even the formulation of a dynamical problem 
in quantum theory is impossible. But if it is accepted, the funda- 
mental indeterminacy in physical predictions becomes natural, 
as no experimental arrangement can ever be absolutely precise. 

I think that even the most fervent determinist cannot deny 
that present quantum mechanics has served us well in actual 
research. Yet he may still hope that one day it will be replaced 
by a deterministic theory of the classical type. 

Allow me to discuss briefly what the chances of such a counter- 
revolution are, and how I expect physics to develop in future. 

It would be silly and arrogant to deny any possibility of a 
return to determinism. For no physical theory is final; new 
experiences may force us to alterations and even reversions. Yet 


scanning the history of physics in the way we have done we see 
fluctuations and vacillations, but hardly a reversion to more 
primitive conceptions. I expect that our present theory will be 
profoundly modified. For it is full of difficulties which I have not 
mentioned at all: the self-energies of particles in interaction
and many other quantities, like collision cross-sections, lead to 
divergent integrals. But I should never expect that these 
difficulties could be solved by a return to classical concepts. I 
expect just the opposite, that we shall have to sacrifice some 
current ideas and use still more abstract methods. However, 
these are only opinions. A more concrete contribution to this 
question has been made by J. v. Neumann in his brilliant book, 
Mathematische Grundlagen der Quantenmechanik. He puts the
theory on an axiomatic basis by deriving it from a few postulates 
of a very plausible and general character, about the properties 
of 'expectation values' (averages) and their representation by 
mathematical symbols. The result is that the formalism of 
quantum mechanics is uniquely determined by these axioms; in 
particular, no concealed parameters can be introduced with the 
help of which the indeterministic description could be trans- 
formed into a deterministic one. Hence if a future theory 
should be deterministic, it cannot be a modification of the present 
one but must be essentially different. How this should be 
possible without sacrificing a whole treasure of well-established 
results I leave to the determinists to worry about. 

I for my part do not believe in the possibility of such a turn 
of things. Though I am very much aware of the shortcomings 
of quantum mechanics, I think that its indeterministic founda- 
tions will be permanent, and this is what interests us from the 
standpoint of these lectures on cause and chance. There remains 
now only to show how the ordinary, apparently deterministic 
laws of physics can be obtained from these foundations. 

The main problem of the classical kinetic theory of matter 
was how to reconcile the reversibility of the mechanical motion 
of the ultimate particles with the irreversibility of the thermodynamical
laws of matter in bulk. This was achieved by proclaiming
a distinction between the true laws which are strictly
deterministic and reversible but of no use for us poor mortals 
with our restricted means of observation and experimentation, 
and the apparent laws which are the result of our ignorance and 
obtained by a deliberate act of averaging, a kind of fraud or 
falsification from the rigorous standpoint of determinism. 

Quantum theory can appear with a cleaner conscience. It 
has no deterministic bias and is statistical throughout. It has 
accepted partial ignorance already on a lower level and need 
not doctor the final laws. 

In order to define a dynamical phenomenon one has, as we have 
seen, to split the system in two parts, one being the interesting 
one, the other a 'perturbation'; and this separation is highly 
arbitrary and adaptable to the experimental arrangement to be 
described. Now this circumstance can be exploited for the pro- 
blem of thermodynamics. There one considers two (or more) 
bodies first separated and in equilibrium, then brought into 
contact and left to themselves until equilibrium is again
attained.

Let $H^{(1)}$ be the Hamiltonian of the first body, $H^{(2)}$ that of the
second, and write

$$H_0 = H^{(1)} + H^{(2)}. \tag{9.31}$$

Then this is the combined Hamiltonian of the separated bodies.
If they are brought into contact the Hamiltonian will be different,
namely

$$H = H_0 + V, \tag{9.32}$$

where V is the interaction, which for ordinary matter in bulk will
consist of surface forces. Now (9.32) has exactly the form of the
Hamiltonian of the fundamental dynamical problem, if we are
'interested' in $H_0$: and that is just the case.

Hence we describe the behaviour of the combined system by 
the proper variables of the unperturbed system, i.e. by the integrals
of motion $\Lambda_1^{(1)}, \Lambda_2^{(1)}, \ldots$ of the first body, and the integrals
of motion $\Lambda_1^{(2)}, \Lambda_2^{(2)}, \ldots$ of the second body, which all together
form the integrals of motion of the separated bodies, represented
by $H_0$. Hence we can use the solution of the dynamical problem
given before, namely (9.25),

$$P(t,\lambda) = P(\lambda) + \sum_{\lambda'} J(\lambda,\lambda')\{P(\lambda') - P(\lambda)\} + \ldots, \tag{9.33}$$

where now λ represents the sets of eigenvalues $\lambda^{(1)} = (\lambda_1^{(1)}, \lambda_2^{(1)}, \ldots)$
of $\Lambda_1^{(1)}, \Lambda_2^{(1)}, \ldots$, and $\lambda^{(2)} = (\lambda_1^{(2)}, \lambda_2^{(2)}, \ldots)$ of $\Lambda_1^{(2)}, \Lambda_2^{(2)}, \ldots$.

Let us consider first statistical equilibrium. Then

$$P(t,\lambda) = P(\lambda);$$

hence the sum must vanish, and one must have

$$P(\lambda') = P(\lambda) \tag{9.34}$$

for any two states λ, λ' for which the transition probability
J(λ,λ') is not zero. But we have seen further that these quantities
J(λ,λ') are in all practical cases proportional to the time
and vanish unless the energy is conserved, E = E' (see formulae
9.27, 9.28). If we disregard cases where other constants of motion
exist for which a conservation law holds (like angular momentum
for systems free to rotate), one can replace P(λ) by P(E). But
as the total system consists of two parts which are practically
independent, one has

$$P(E) = P(\lambda) = P_1(\lambda^{(1)})\,P_2(\lambda^{(2)}), \tag{9.35}$$

where the two factors represent the probabilities of finding the
separated parts initially in the states $\lambda^{(1)}$ and $\lambda^{(2)}$. This factorization
need not be taken from the axioms of the calculus of probability;
it is a consequence of quantum mechanics itself. For
if the energy is a sum of the form (9.31), the exact solution of the
fundamental equation for the density operator

$$\frac{\partial\rho}{\partial t} = [H, \rho] \tag{9.36}$$

is $\rho = \rho_1\rho_2$, where $\rho_1$ refers to the first system $H^{(1)}$, $\rho_2$ to the
second $H^{(2)}$; as according to (9.24) P(t, λ) = ρ(t; λ, λ), the product
formula (9.35) holds not only for the stationary case (as long as
the interactions can be neglected). If now $E^{(1)}(\lambda^{(1)})$ and $E^{(2)}(\lambda^{(2)})$
are the energies of the separated parts, one obtains from (9.35)

$$P(E^{(1)} + E^{(2)}) = P_1(E^{(1)})\,P_2(E^{(2)}), \tag{9.37}$$

which is a functional equation for the three functions $P$, $P_1$, $P_2$.
The solution is easily found to be (see Appendix, 31)

$$P = e^{\alpha - \beta E}, \qquad P_1 = e^{\alpha_1 - \beta E_1}, \qquad P_2 = e^{\alpha_2 - \beta E_2}, \tag{9.38}$$

with

$$\alpha = \alpha_1 + \alpha_2, \qquad E = E_1 + E_2 \tag{9.39}$$

and the same β in all three expressions.

Thus we have found again the canonical distribution of Gibbs, 
with the modification that the energies appearing are not explicit 
functions of q and p (Hamiltonians) but of the eigenvalues 
λ, $\lambda^{(1)}$, $\lambda^{(2)}$ of the integrals of motion.
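
Why the multiplication of probabilities together with the additivity of energies forces an exponential law can be sketched briefly. The outline assumes that the three functions are positive and differentiable, and writes the functional equation as $P(E_1+E_2) = P_1(E_1)P_2(E_2)$; it is an illustration of the idea, not a reproduction of the argument in the Appendix.

```latex
\begin{aligned}
\ln P(E_1+E_2) &= \ln P_1(E_1) + \ln P_2(E_2), \\
\frac{\partial}{\partial E_1}:\quad
  \frac{P'(E_1+E_2)}{P(E_1+E_2)} &= \frac{P_1'(E_1)}{P_1(E_1)}, \qquad
\frac{\partial}{\partial E_2}:\quad
  \frac{P'(E_1+E_2)}{P(E_1+E_2)} = \frac{P_2'(E_2)}{P_2(E_2)}, \\
\frac{P_1'(E_1)}{P_1(E_1)} &= \frac{P_2'(E_2)}{P_2(E_2)} = -\beta
  \quad \text{(constant, since the two sides depend on different variables)}, \\
P_1 &= e^{\alpha_1-\beta E_1}, \qquad P_2 = e^{\alpha_2-\beta E_2}, \qquad
P = P_1P_2 = e^{(\alpha_1+\alpha_2)-\beta(E_1+E_2)} .
\end{aligned}
```

The single constant β shared by both bodies is what later plays the part of the common temperature.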

This derivation is obviously a direct descendant of Maxwell's 
first proof of his velocity distribution law which we discussed 
previously, p. 51. But while the argument of independence is not 
justifiable with regard to the three components of velocity, it is 
perfectly legitimate for the constants of motion λ. The fact
that the multiplication law of probabilities and the additivity 
of energies for independent systems leads to the exponential 
distribution law has, of course, been noticed and used by many 
authors, beginning with Gibbs himself. This reasoning becomes, 
with the help of quantum mechanics, an exact proof which shows 
the limits of validity of the results. For if there exist constants 
of motion other than the energy, the distribution law has to 
be modified, and therefore the whole of thermodynamics. This 
happens for instance for bodies moving freely in space, like stars, 
where the quantity β = 1/kT is no longer a scalar but the time 
component of a relativistic four-vector, the other components 
representing βv, where v is the mean velocity of the body. Yet 
this is outside the scope of these lectures. 

The simplest and much discussed application of quantum 
statistics is that to the ideal gases. It was Einstein who first 
noticed that for very low temperatures deviations from the 
classical laws should appear. The Indian physicist, Bose, had 
shown that one can obtain Planck's law of radiation by regarding 
the radiation as a 'photon gas', provided one did not treat 
the photons as individual recognizable particles but as completely 
indistinguishable. Einstein transferred this idea to 
material atoms. Later it was recognized that this so-called Bose-Einstein 
statistics was a straightforward consequence of quantum 
mechanics; about the same time Fermi and Dirac discovered 
another similar case which applies to electrons and other particles 
with spin. 

In the language used here the two 'statistics' can be simply 
characterized by the symmetry of the density function 

\rho(x_1, x_2, ..., x_N; x'_1, x'_2, ..., x'_N). 

It is always symmetric, for indistinguishable particles, in both 
sets of arguments, i.e. it remains unchanged if both sets are 
subject to the same permutation. If, however, only one set is 
permuted, ρ remains also unchanged in the Bose-Einstein case 
for all permutations, while in the Fermi-Dirac case it does so 
only for even permutations, and changes sign for odd permutations. 
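
The symmetry rule can be illustrated on a toy case. The sketch below is an assumption-laden example, not the author's construction: it takes two particles on three discrete sites, builds a density ρ = ψψ from a real symmetric (Bose-Einstein) or antisymmetric (Fermi-Dirac) amplitude ψ, and compares ρ before and after permuting one or both sets of arguments. All names are hypothetical.

```python
def amplitude(u, v, sign):
    """psi[a][b] = u[a] v[b] + sign * u[b] v[a]; sign = +1 (Bose), -1 (Fermi)."""
    n = len(u)
    return [[u[a] * v[b] + sign * u[b] * v[a] for b in range(n)] for a in range(n)]

def density(psi):
    """rho[a][b][c][d] = psi(a, b) * psi(c, d) for a real two-particle amplitude.

    (a, b) is the unprimed set of arguments, (c, d) the primed set.
    """
    n = len(psi)
    return [[[[psi[a][b] * psi[c][d] for d in range(n)]
              for c in range(n)]
             for b in range(n)]
            for a in range(n)]

def swap_unprimed(rho):
    """Exchange the two particles in the unprimed set only (an odd permutation)."""
    n = len(rho)
    return [[[[rho[b][a][c][d] for d in range(n)]
              for c in range(n)]
             for b in range(n)]
            for a in range(n)]

def swap_both(rho):
    """Exchange the two particles in both sets of arguments simultaneously."""
    n = len(rho)
    return [[[[rho[b][a][d][c] for d in range(n)]
              for c in range(n)]
             for b in range(n)]
            for a in range(n)]

def max_deviation(r1, r2, factor):
    """Largest |r1 - factor * r2| over all entries."""
    n = len(r1)
    return max(abs(r1[a][b][c][d] - factor * r2[a][b][c][d])
               for a in range(n) for b in range(n)
               for c in range(n) for d in range(n))

if __name__ == "__main__":
    u, v = [1.0, 2.0, 0.5], [0.3, -1.0, 2.0]   # arbitrary sample amplitudes
    for name, sign in (("Bose-Einstein", 1), ("Fermi-Dirac", -1)):
        rho = density(amplitude(u, v, sign))
        print(name,
              "both sets:", max_deviation(swap_both(rho), rho, 1.0),
              "one set:", max_deviation(swap_unprimed(rho), rho, float(sign)))
```

With two particles the only non-trivial permutation is the exchange, which is odd; the Fermi-Dirac density therefore changes sign when one set alone is permuted and is unchanged when both are.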

Applied to a system of free particles of equal structure, one 
obtains at once from the canonical distribution law the pro- 
perties of so-called degenerate gases. But as these are treated in 
many text-books, I shall not discuss them here (see Appendix, 

After having considered statistical equilibrium we have now 
to ask whether quantum mechanics accounts for the fact that 
every system approaches equilibrium in time by the dissipation 
of visible energy into heat, or, in other words, whether the 
H-theorem of Boltzmann holds. 

This is the case indeed, and not difficult to prove. One defines 
the total entropy, just as in classical theory, by 


where the summations are to be extended over all values of the 
λ_1, λ_2, ...; i.e. for each separate part of a coupled system over 
λ_1^(1), λ_2^(1), ... and λ_1^(2), λ_2^(2), ..., respectively, and for the whole system 
over both sets. For loosely coupled systems the probabilities 
are, as we have seen, multiplicative at any time: 

P(t; \lambda^{(1)}, \lambda^{(2)}) = P_1(t, \lambda^{(1)}) P_2(t, \lambda^{(2)}).   (9.41)



From this it follows easily that the entropies are additive, 

S = S_1 + S_2.   (9.42)

Now substitute into (9.40) the explicit expression for P(t, λ) 
from (9.33) which holds for weak coupling; then by neglecting 
higher powers of the small quantities J(λ, λ') one obtains 


where   Q(\lambda, \lambda') = \{P(\lambda) - P(\lambda')\} \log \frac{P(\lambda)}{P(\lambda')}.   (9.44)

The transition probabilities J(λ, λ') are, as we have seen, in all 
practical cases proportional to time and vanish for transitions 
for which energy is not conserved; one has, according to (9.27) 
and (9.28), 

J(\lambda, \lambda') = t\,|V(\lambda, \lambda')|^2\,\delta(E - E'),   (9.45)

where V is the interaction potential. These quantities J(λ, λ') 
are always positive. So is the denominator Σ P(λ), while Q(λ, λ') 
is positive as long as P(λ) differs from P(λ'). 

Hence S increases with time and will continue to do so, until 
statistical equilibrium is reached. For only then no further 
increase of S will happen, as is seen by taking equilibrium as the 
initial state (where according to (9.34) Q(λ, λ') = 0 for all 
non-vanishing transitions). 
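
The monotonic growth of S can be made concrete with a small numerical experiment. The sketch below rests on assumptions not taken from the text: it uses a rate equation dP(λ)/dt = Σ r(λ, λ'){P(λ') − P(λ)} with a symmetric, positive rate matrix r (standing in for the transition probabilities per unit time), three states of equal energy, and the entropy S/k = −Σ P log P for a normalized distribution. The function names and rate values are hypothetical.

```python
import math

def entropy(p):
    """S/k = -sum P log P for a normalized distribution (zero terms are skipped)."""
    return -sum(x * math.log(x) for x in p if x > 0.0)

def step(p, rate, dt):
    """One Euler step of dP_i/dt = sum_j rate[i][j] * (P_j - P_i).

    rate is assumed symmetric with non-negative entries, so that total
    probability is conserved and the uniform distribution is stationary.
    """
    n = len(p)
    return [p[i] + dt * sum(rate[i][j] * (p[j] - p[i]) for j in range(n))
            for i in range(n)]

def relax(p, rate, dt, steps):
    """Integrate the rate equation and record the entropy after every step."""
    history = [entropy(p)]
    for _ in range(steps):
        p = step(p, rate, dt)
        history.append(entropy(p))
    return p, history

if __name__ == "__main__":
    rate = [[0.0, 1.0, 0.5],
            [1.0, 0.0, 0.2],
            [0.5, 0.2, 0.0]]          # illustrative symmetric rates
    p_final, hist = relax([0.7, 0.2, 0.1], rate, dt=0.01, steps=2000)
    print("final distribution:", p_final)
    print("initial and final S/k:", hist[0], hist[-1], "log 3 =", math.log(3))
```

In this toy model the entropy never decreases from one step to the next and settles at log 3, the value for equal probabilities, at which point every term of the form {P(λ) − P(λ')} log[P(λ)/P(λ')] vanishes.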

It remains now to investigate whether quantum kinetics 
leads, for matter in bulk, to the ordinary laws of motion and ther- 
mal conduction as formulated by Cauchy. This is indeed the 
case, as far as these laws are expressed in terms of stress, energy, 
and flux of matter and heat. Yet, as we have seen, this is only 
half the story, since Cauchy's equations are rather void of mean- 
ing as long as the dependence of these quantities on strain, 
temperature, and the rate of their changes in space and time are 
not given. Now in these latter relations the difference between 
quantum theory and classical theory appears and can reach vast 
proportions under favourable circumstances, chiefly for low 
temperatures. The theory sketched in the following is mainly 
due to my collaborator, Green. 

The formal method of obtaining the hydrothermal equations 
is very similar to that used in classical theory. Starting from the 
fundamental equation for N particles 

\frac{\partial \rho_N}{\partial t} = [H_N, \rho_N],   (9.46)

a reduction process is applied to obtain similar equations for 
N−1, N−2, ..., particles, until the laws of motion for one particle 
are reached. 

The reduction consists, as in classical theory, in averaging 
over one, say the last, particle of a set. The coordinates of each 
particle appear twice as arguments of a matrix 

\rho_n = \rho_n(x^{(1)}, ..., x^{(n)}; x^{(1)'}, ..., x^{(n)'}).

Put here x^(n) = x^(n)' and integrate over x^(n); the result is χ_n ρ_n, a 
matrix which depends only on x^(1), ..., x^(n−1); x^(1)', ..., x^(n−1)'. 
With the same normalization as in classical theory, (6.40), p. 67, 
we have 

By applying this operation several times to (9.46) one obtains 
(see Appendix, 33) 

\frac{\partial \rho_q}{\partial t} = [H_q, \rho_q] + S_q   (q = 1, 2, ..., N),   (9.48)

where   S_q = \sum_{i=1}^{q} \chi_{q+1} [\Phi^{(i,q+1)}, \rho_{q+1}],   (9.49)

in full analogy with the corresponding classical equations (6.44), 
(6.45). Here H_q means the Hamiltonian of q particles, Φ^(i,q+1) 
the interaction between one of these (i) and a further particle 
(q+1), and S_q is called, as before, the statistical term. 
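
The reduction step (set the two arguments of the last particle equal and integrate) is what is now usually called a partial trace. A minimal sketch on a discrete grid follows; it is an illustration under stated assumptions, not the formalism of the Appendix: positions are restricted to a few sites so that the integral becomes a sum, the two-particle density is built from a real amplitude, and it is normalized to unit trace rather than by the convention of (6.40). All names are hypothetical.

```python
import math

def two_particle_density(psi):
    """rho2[a][b][c][d] = psi(a, b) * psi(c, d); (a, b) unprimed, (c, d) primed."""
    n = len(psi)
    return [[[[psi[a][b] * psi[c][d] for d in range(n)]
              for c in range(n)]
             for b in range(n)]
            for a in range(n)]

def reduce_last(rho2):
    """Put x(2) = x(2)' and sum over it, leaving a matrix in x(1), x(1)'."""
    n = len(rho2)
    return [[sum(rho2[a][b][c][b] for b in range(n)) for c in range(n)]
            for a in range(n)]

def number_density(rho1):
    """Diagonal elements of the one-particle matrix (x = x')."""
    return [rho1[a][a] for a in range(len(rho1))]

if __name__ == "__main__":
    s = 1.0 / math.sqrt(2.0)
    psi = [[0.0, s],
           [s, 0.0]]              # one particle on each of two sites, symmetrized
    rho1 = reduce_last(two_particle_density(psi))
    print("reduced matrix:", rho1)
    print("diagonal (density per site):", number_density(rho1))
```

For this amplitude the reduced matrix is diagonal with equal weight on the two sites. Applying the same operation repeatedly to an N-particle matrix would produce the chain of reduced matrices referred to above.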

The quantity ρ_q(x, x) = n_q(x) represents the generalized 
number density for a 'cluster' of q particles, and in particular 
n_1(x) is the ordinary number density. 

Now one can obtain generalized hydro-thermodynamical 
equations from (9.47) by a similar process to that employed in 
classical theory. Instead of integrating over the velocities one 
has to take the diagonal terms of the matrices (putting x = x'), 
and one has to take some precautions in regard to non-commutativity 
by symmetrizing products, e.g. replacing αβ by ½(αβ+βα) 
(see Appendix, 33). Exactly as in the classical equations of 
motion there appears the average kinetic energy of the particle 
(i) in a cluster of q particles which, divided by (3/2)k, may be called 
a kinetic temperature of the particle (i) in the cluster of q particles. 
One might expect that the quantity T_1 corresponds to the 
ordinary temperature; but this is not the case. 

It is well known from simple examples (e.g. the harmonic 
oscillator) that in quantum theory for statistical equilibrium the 
thermodynamic temperature T, defined as the integrating denominator 
of entropy, is not equal to the mean square momentum. 
Here in the case of non-equilibrium it turns out that not 
only this happens, but that a similar deviation occurs with regard 
to pressure. The thermodynamical pressure p is defined as the 
work done by compression for unit change of volume; the kinetic 
pressure p_1 is the isotropic part of the stress tensor in the equations 
of motion. These two quantities differ in quantum theory. 

Observable effects produced by this difference occur only for 
extremely low temperatures. For gases these are so low that 
they cannot be reached at all because condensation takes place 
long before. Most substances are solid crystals in this region of 
temperature; for these one has a relatively simple quantum 
theory, initiated by Einstein, where the vibrating lattice is 
regarded as equivalent to a set of oscillators (the 'normal modes'). 
This theory represents the quantum effects in equilibrium 
(specific heat, thermal expansion) fairly well down to zero 
temperature, while the phenomena of flow are practically 

There are only two cases where quantum phenomena of flow 
at very low temperatures are conspicuous. One is liquid helium 
which, owing to its small mass and weak cohesion, does not 
crystallize under normal pressure even for the lowest temperatures 
and becomes supra-fluid at about 2° absolute. The other 
case is that of the electrons in metals which, though not an 
ordinary fluid, behave in many respects like one and, owing to 
their tiny mass, exhibit quantum properties, the strangest of 
which is supra-conductivity. 

In order to confirm the principles of quantum statistics, 
investigations of these two cases are of great interest. Both 
have been studied theoretically in my department in Edinburgh, 
and I wish to say a few words about our results. 

In the supra-fluid state helium behaves very differently from a 
normal liquid. It appears to lose its viscosity almost completely; 
it flows through capillaries or narrow slits with a fixed velocity 
almost independent of the pressure, creeps along the walls of the 
container, and so on. A metal in the supra-conductive state has, as 
the name says, no measurable electrical resistance and behaves 
abnormally in other ways. A common and very conspicuous 
feature of both phenomena is the sharpness of the transition 
point which is accompanied by an anomaly of the specific 
heat: it rises steeply if the temperature approaches the critical 
value T_c from below, and drops suddenly for T = T_c, so that the 
graph looks like the Greek letter λ; hence the expression λ-point 
for T_c. However, this similarity cannot be very deeply rooted. 
Where has one to expect, from the theoretical standpoint, the 
beginning of quantum phenomena? Evidently when the momentum 
p of the particles and some characteristic length l are reaching 
the limit stated by the uncertainty principle, pl ~ ħ. If we 
equate the kinetic energy p²/2m to the thermal energy kT, 
the critical temperature will be given by kT_c ~ ħ²/2ml². If one 
substitutes here for k and ħ the well-known numerical values and 
for m the mass of a hydrogen atom times the atomic mass 
number μ, one finds, in degrees absolute, 


T_c \sim \frac{24}{\mu l^2},   (9.50)

where l is measured in Ångström units (10⁻⁸ cm.). 

For a helium atom one has μ = 4, and if l is the mean distance 
of two atoms (order 1 Å) one obtains for T_c a few degrees, which 
agrees with the observed transition at about 2°. But for electrons 
in metals one has μ = 1/1840. If one now assumes one 
electron per atom and interprets l as their mean distance it would 
be again of the order 1 Å, hence the expression (9.50) would 
become some thousand degrees and has therefore nothing to do 
with the λ-point of supra-conductivity. This temperature has, 
in fact, another meaning; it is the so-called 'degeneration temperature' 
T_g of the electronic fluid; below T_g, for instance at 
ordinary temperatures, there are already strong deviations from 
classical behaviour (e.g. the extremely small contribution of the 
electrons to the specific heat), though not of the extreme character 
of supra-conductivity. In order to explain the λ-point of 
supra-conductivity which lies for all metals at a few degrees 
absolute, one has to take l about 200 times larger (≈ 200 Å). 
As the interpretation of this length is still controversial, I shall 
not discuss supra-conductivity any further (see Appendix, 34). 
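
The order-of-magnitude estimate kT_c ~ ħ²/2ml² with m = μ times the hydrogen mass is easy to evaluate. The sketch below is only a numerical illustration: the constants are present-day reference values (an assumption, since the text does not list the figures it used), and the function name is hypothetical.

```python
HBAR = 1.054571817e-34   # reduced Planck constant, J s (modern value, assumed)
K_B = 1.380649e-23       # Boltzmann constant, J/K (modern value, assumed)
M_H = 1.6735575e-27      # mass of a hydrogen atom, kg (modern value, assumed)
ANGSTROM = 1.0e-10       # metres per Angstrom unit

def critical_temperature(mu, l_angstrom):
    """Estimate T_c = hbar^2 / (2 m l^2 k) in degrees absolute.

    mu is the atomic mass number (mass in units of the hydrogen atom),
    l_angstrom the characteristic length in Angstrom units.
    """
    mass = mu * M_H
    length = l_angstrom * ANGSTROM
    return HBAR ** 2 / (2.0 * mass * length ** 2 * K_B)

if __name__ == "__main__":
    cases = [
        ("coefficient (mu = 1, l = 1 A)", 1.0, 1.0),
        ("helium, l = 1 A", 4.0, 1.0),
        ("electrons, l = 1 A", 1.0 / 1840.0, 1.0),
        ("electrons, l = 200 A", 1.0 / 1840.0, 200.0),
    ]
    for label, mu, l in cases:
        print(f"{label}: {critical_temperature(mu, l):.3g}")
```

With these constants the coefficient comes out at about 24, helium at a few degrees, electrons at 1 Å at some tens of thousands of degrees, and electrons at 200 Å at about one degree, in line with the orders of magnitude quoted in the text.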

Nor do I intend in the case of supra-fluidity of helium to give 
a full explanation of the λ-discontinuity, but I wish to direct 
your attention to the thermo-mechanical properties of the 
supra-liquid below the λ-point, called He II. 

I have already mentioned that in quantum liquids one has to 
distinguish the ordinary thermodynamic temperature T and 
pressure p from the kinetic temperature T_1 and pressure p_1. 
The hydro-thermal equations contain only T_1 and p_1, and these 
quantities are constant in equilibrium, i.e. for a state where no 
change in time takes place. But T_1 and p_1 are not simple 
functions of T and p but depend also on the velocity and its 
gradient. Therefore in such a state permanent currents of mass 
and of energy may flow as if no viscosity existed. This is reflected in 
the energy balance which can be derived from the hydro-thermal 
equations. One obtains a curious result which looks like a violation 
of the first law of thermodynamics; for the change of heat 
is given by 

dQ = T\,dS = dU + p\,dV - V\,d\pi,   (9.51)

where all symbols have the usual meaning, and π = p_1 − p is 
the difference of the kinetic and thermodynamic pressures. This 
equation differs from the ordinary thermodynamical expression 
(5.12) by the term V dπ; how is this possible if thermodynamics 
claims rightly universal validity? This claim is quite legitimate, 
but the usual form of the expression for dQ depends on the 
assumption that a quasi-static, i.e. very slow, process can be 
regarded as a sequence of equilibria each determined by the 
instantaneous values of pressure and volume. This is correct in 
the classical domain, because if the rate of change of external 
action (compression, heat supply, etc.) is slowed down, all veloci- 
ties in the fluid tend to disappear. Not so in quantum mechanics. 
In consequence of the indeterminacy condition the momenta or 
the velocities cannot decrease indefinitely if the coordinates of 
the particles are restricted to very small regions. An investiga- 
tion of the hydro-thermal equations shows that this effect is 
preserved, to some degree, even for the visible velocities; it is 
true there can exist a genuine statistical equilibrium where the 
density is uniform and the currents of mass and energy vanish, 
but there are also those states possible where certain combina- 
tions of currents of mass (velocities) and of energy (heat) per- 
manently exist. The production of these depends entirely on 
the way in which the heat dQ is supplied to the system and cannot 
be suppressed by just making the rate of change of volume very 
small. We have therefore not a breakdown of the law of con- 
servation of energy but of its traditional thermodynamical 

The consequences of that extra term in (9.51) are easily seen 
by introducing instead of the internal energy the quantity 

X = U - \pi V   (9.52)

in the expression (9.51) for dQ, which then reads 

dQ = dX + p_1\,dV,   (9.53)

where p_1 = p + π is the kinetic pressure. This shows that the 
specific heat at constant volume is 

c_v = \left(\frac{\partial X}{\partial T}\right)_V,   (9.54)

not (∂U/∂T)_V as in classical thermodynamics. Now as p_1, and 
therefore π = p_1 − p, is very large at T = 0 and decreases with 
increasing T to reach the value 0 at the λ-point, one obtains for 
c_v(T) a curve exactly of the form actually observed. Hence the 
λ-anomaly is due to the coupling of heat currents with the 
mass motion characteristic of quantum liquids. It is a molar, 
macroscopic motion, the shape of which depends on the geo- 
metrical conditions, presumably consisting of tiny closed threads 
of fast-moving liquid, or groups of density waves. 

A similar conception has been derived by several authors 
(Tisza, Mendelssohn, Landau) from the experiments; they speak 
of the liquid being a mixture of ordinary atoms and special 
degenerate atoms (z-particles) which are in the lowest quantum 
state and carry neither energy nor entropy. Yet in a liquid one 
cannot attribute a quantum state to single atoms. 

These considerations are also the clue to the understanding 
of other anomalous phenomena, as the flow through narrow 
capillaries or slits, the so-called fountain effect, the 'second 
sound', etc. Green has studied the properties of He II in detail 
and arrived at the conclusion that the quantum theory of liquids 
can account for the strange behaviour of this substance. 

I have dwelt on this special problem in some detail as it reveals 
in a striking way that quantum phenomena are not confined to 
atomic physics or microphysics where one aims at observing 
single particles, but appear also in molar physics which deals 
with matter in bulk. From the fundamental standpoint this 
distinction, so essential in classical physics, loses much of its 
meaning in quantum theory. The ultimate laws are statistical, 
and the deterministic form of the molar equations holds for 
certain averages which for large numbers of particles or quanta 
are all one wants to know. 

Now these molar laws satisfy all postulates of classical 
causality: they are deterministic and conform to the principles 
of contiguity and antecedence. 

With this statement the circle of our considerations about 
cause and chance in physics is closed. We have seen how classical 
physics struggled in vain to reconcile growing quantitative 
observations with preconceived ideas on causality, derived from 
everyday experience but raised to the level of metaphysical 
postulates, and how it fought a losing battle against the intrusion 
of chance. To-day the order of ideas has been reversed: chance 
has become the primary notion, mechanics an expression of its 
quantitative laws, and the overwhelming evidence of causality 
with all its attributes in the realm of ordinary experience is 
satisfactorily explained by the statistical laws of large numbers. 


THE statistical interpretation which I have presented in the last 
section is now generally accepted by physicists all over the world, 
with a few exceptions, amongst them a most remarkable one. 
As I have mentioned before, Einstein does not accept it, but still 
believes in and works on a return to a deterministic theory. To 
illustrate his opinion, let me quote passages from two letters. 
The first is dated 7 November 1944, and contains these lines: 

'In unserer wissenschaftlichen Erwartung haben wir uns zu Antipoden 
entwickelt. Du glaubst an den würfelnden Gott und ich an volle Gesetzlichkeit 
in einer Welt von etwas objektiv Seiendem, das ich auf wild 
spekulativem Weg zu erhaschen suche. Ich hoffe, dass einer einen mehr 
realistischen Weg, bezw. eine mehr greifbare Unterlage für eine solche 
Auffassung finden wird, als es mir gegeben ist. Der grosse anfängliche 
Erfolg der Quantentheorie kann mich doch nicht zum Glauben an das 
fundamentale Würfelspiel bringen.' 

(In our scientific expectations we have progressed towards antipodes. 
You believe in the dice-playing god, and I in the perfect rule of law in a 
world of something objectively existing which I try to catch in a wildly 
speculative way. I hope that somebody will find a more realistic way, 
or a more tangible foundation for such a conception than that which is 
given to me. The great initial success of quantum theory cannot convert 
me to believe in that fundamental game of dice.) 

The second letter, which arrived just when I was writing these 
pages (dated 3 December 1947), contains this passage: 

'Meine physikalische Haltung kann ich Dir nicht so begründen, dass 
Du sie irgendwie vernünftig finden würdest. Ich sehe natürlich ein, dass 
die principiell statistische Behandlungsweise, deren Notwendigkeit im 
Rahmen des bestehenden Formalismus ja zuerst von Dir klar erkannt 
wurde, einen bedeutenden Wahrheitsgehalt hat. Ich kann aber deshalb 
nicht ernsthaft daran glauben, weil die Theorie mit dem Grundsatz 
unvereinbar ist, dass die Physik eine Wirklichkeit in Zeit und Raum 
darstellen soll, ohne spukhafte Fernwirkungen. . . . Davon bin ich fest 
überzeugt, dass man schliesslich bei einer Theorie landen wird, deren 
gesetzmässig verbundene Dinge nicht Wahrscheinlichkeiten, sondern 
gedachte Tatbestände sind, wie man es bis vor kurzem als selbstverständlich 
betrachtet hat. Zur Begründung dieser Überzeugung kann 
ich aber nicht logische Gründe, sondern nur meinen kleinen Finger als 
Zeugen beibringen, also keine Autorität, die ausserhalb meiner Haut 
irgendwelchen Respekt einflössen kann.' 


(I cannot substantiate my attitude to physics in such a manner that 
you would find it in any way rational. I see of course that the statistical 
interpretation (the necessity of which in the frame of the existing for- 
malism has been first clearly recognized by yourself) has a considerable 
content of truth. Yet I cannot seriously believe it because the theory 
is inconsistent with the principle that physics has to represent a reality 
in space and time without phantom actions over distances. ... I am 
absolutely convinced that one will eventually arrive at a theory in which 
the objects connected by laws are not probabilities, but conceived facts, 
as one took for granted only a short time ago. However, I cannot provide 
logical arguments for my conviction, but can only call on my little finger 
as a witness, which cannot claim any authority to be respected outside 
my own skin.) 

I have quoted these letters because I think that the opinion 
of the greatest living physicist, who has done more than anybody 
else to establish modern ideas, must not be by-passed. Einstein 
does not share the opinion held by most of us that there is over- 
whelming evidence for quantum mechanics. Yet he concedes 
'initial success' and 'a considerable degree of truth'. He obviously 
agrees that we have at present nothing better, but he 
hopes that this will be achieved later, for he rejects the 'dice- 
playing god'. I have discussed the chances of a return to deter- 
minism and found them slight. I have tried to show that classical 
physics is involved in no less formidable conceptional difficulties 
and had eventually to incorporate chance in its system. We 
mortals have to play dice anyhow if we wish to deal with atomic 
systems. Einstein's principle of the existence of an objective 
real world is therefore rather academic. On the other hand, his 
contention that quantum theory has given up this principle is 
not justified, if the conception of reality is properly understood. 
Of this I shall say more presently. 

Einstein's letters teach us impressively the fact that even an 
exact science like physics is based on fundamental beliefs. The 
words ich glaube appear repeatedly, and once they are under- 
lined. I shall not further discuss the difference between Ein- 
stein's principles and those which I have tried to extract from 
the history of physics up to the present day. But I wish to 
collect some of the fundamental assumptions which cannot be 
further reduced but have to be accepted by an act of faith. 


Causality is such a principle, if it is defined as the belief in the 
existence of mutual physical dependence of observable situa- 
tions. However, all specifications of this dependence in regard 
to space and time (contiguity, antecedence) and to the infinite 
sharpness of observation (determinism) seem to me not funda- 
mental, but consequences of the actual empirical laws. 

Another metaphysical principle is incorporated in the notion 
of probability. It is the belief that the predictions of statistical 
calculations are more than an exercise of the brain, that they can 
be trusted in the real world. This holds just as well for ordinary 
probability as for the more refined mixture of probability and 
mechanics formulated by quantum theory. 

The two metaphysical conceptions of causality and probability 
have been our main theme. Others, concerning logic, arithmetic, 
space, and time, are quite beyond the frame of these lectures. 
But let me add a few more which have occasionally occurred, 
though I am sure that my list will be quite incomplete. One is 
the belief in harmony in nature, which is something distinct from 
causality, as it can be circumscribed by words like beauty, 
elegance, simplicity applied to certain formulations of natural 
laws. This belief has played a considerable part in the development 
of theoretical physics (remember Maxwell's equations of 
the electromagnetic field, or Einstein's relativity), but how far 
it is a real guide in the search of the unknown or just the expression 
of our satisfaction to have discovered a significant relation, 
I do not venture to say. For I have on occasion made the sad 
discovery that a theory which seemed to me very lovely neverthe- 
less did not work. And in regard to simplicity, opinions will 
differ in many cases. Is Einstein's law of gravitation simpler 
than Newton's? Trained mathematicians will answer Yes, 
meaning the logical simplicity of the foundations, while others 
will say emphatically No, because of the horrible complication 
of the formalism. However this may be, this kind of belief may 
help some specially gifted men in their research; for the validity 
of the result it has little importance (see Appendix, 35). 

The last belief I wish to discuss may be called the principle of 
objectivity. It provides a criterion to distinguish subjective 
impressions from objective facts, namely by substituting for 
given sense-data others which can be checked by other indivi- 
duals. I have spoken about this method when I had to define 
temperature: the subjective feeling of hot and cold is replaced 
by the reading of a thermometer, which can be done by any 
person without a sensation of hot or cold. It is perhaps the most 
important rule of the code of natural science of which innumer- 
able examples can be given, and it is obviously closely related 
to the conception of scientific reality. For if reality is understood 
to mean the sum of observational invariants (and I cannot see 
any other reasonable interpretation of this word in physics), the 
elimination of sense qualities is a necessary step to discover 

Here I must refer to the previous Waynflete Lectures given by 
Professor E. D. Adrian, on The Physical Background of Percep- 
tion, because the results of physiological investigations seem to 
me in perfect agreement with my suggestion about the meaning 
of reality in physics. The messages which the brain receives have 
not the least similarity with the stimuli. They consist in pulses 
of given intensities and frequencies, characteristic for the trans- 
mitting nerve-fibre, which ends at a definite place of the cortex. 
All the brain 'learns' (I use here the objectionable language of 
the 'disquieting figure of a little hobgoblin sitting up aloft in the 
cerebral hemisphere') is a distribution or 'map' of pulses. From 
this information it produces the image of the world by a process 
which can metaphorically be called a consummate piece of com- 
binatorial mathematics: it sorts out of the maze of indifferent 
and varying signals invariant shapes and relations which form 
the world of ordinary experience. 

This unconscious process breaks down for scientific ultra- 
experience, obtained by magnifying instruments. But then it is 
continued in the full light of consciousness, by mathematical 
reasoning. The result is the reality offered by theoretical physics. 

The principle of objectivity can, I think, be applied to every 
human experience, but is often quite out of place. For instance: 
what is a fugue by Bach? Is it the invariant cross-section, or the 
common content of all printed or written copies, gramophone 
records, sound waves at performances, etc., of this piece of music? 
As a lover of music I say No! that is not what I mean by a fugue. 
It is something of another sphere where other notions apply, 
and the essence of it is not 'notions' at all, but the immediate 
impact on my soul of its beauty and greatness. 

In cases like this, the idea of scientific objective reality is 
obviously inadequate, almost absurd. 

This is trivial, but I have to refer to it if I have to make good 
my promise to discuss the bearing of modern physical thought 
on philosophical problems, in particular on the problem of free 
will. Since ancient times philosophers have been worried how 
free will can be reconciled with causality, and after the tremen- 
dous success of Newton's deterministic theory of nature, this 
problem seemed to be still more acute. Therefore, the advent of 
indeterministic quantum theory was welcomed as opening a 
possibility for the autonomy of the mind without a clash with 
the laws of nature. Free will is primarily a subjective pheno- 
menon, the interpretation of a sensation which we experience, 
similar to a sense impression. We can and do, of course, project 
it into the minds of our fellow beings just as we do in the case 
of music. We can also correlate it with other phenomena in order 
to transform it into an objective relation, as the moralists, 
sociologists, lawyers do; but then it resembles the original 
sensation no more than an intensity curve in a spectral diagram 
resembles a colour which I see. After this transformation into 
a sociological concept, free will is a symbolic expression to 
describe the fact that the actions and reactions of human beings 
are conditioned by their internal mental structure and depend on 
their whole and unaccountable history. Whether we believe 
theoretically in strict determinism or not, we can make no use 
of this theory since a human being is too complicated, and we 
have to be content with a working hypothesis like that of spon- 
taneity of decision and responsibility of action. If you feel that 
this clashes with determinism, you have now at your disposal 
the modern indeterministic philosophy of nature, you can assume 
a certain 'freedom', i.e. deviation from the deterministic laws, 
because these are only apparent and refer to averages. Yet if 
you believe in perfect freedom you will get into difficulties again, 
because you cannot neglect the laws of statistics which are laws 
of nature. 

I think that the philosophical treatment of the problem of free 
will suffers often (see Appendix, 36) from an insufficient dis- 
tinction between the subjective and objective aspect. It is 
doubtless more difficult to keep these apart in the case of such 
sensations as free will, than in the case of colours, sounds, or 
temperatures. But the application of scientific conceptions to a 
subjective experience is an inadequate procedure in all such 

You may call this an evasion of the problem, by means of 
dividing all experience into two categories, instead of trying to 
form one all-embracing picture of the world. This division is 
indeed what I suggest and consider to be unavoidable. If quan- 
tum theory has any philosophical importance at all, it lies in the 
fact that it demonstrates for a single, sharply defined science the 
necessity of dual aspects and complementary considerations. Niels 
Bohr has discussed this question with respect to many applica- 
tions in physiology, psychology, and philosophy in general. 
According to the rule of indeterminacy, you cannot measure 
simultaneously position and velocity of particles, but you have 
to make your choice. The situation is similar if you wish, for 
instance, to determine the physico-chemical processes in the 
brain connected with a mental process: it cannot be done because 
the latter would be decidedly disturbed by the physical investiga- 
tion. Complete knowledge of the physical situation is only 
obtainable by a dissection which would mean the death of the 
living organ or the whole creature, the destruction of the mental 
situation. This example may suffice; you can find more and 
subtler ones in Bohr's writings. They illustrate the limits of 
human understanding and direct the attention to the question 
of fixing the boundary line, as physics has done in a narrow 
field by discovering the quantum constant ħ. Much futile 
controversy could be avoided in this way. To show this by a 
final example, I wish to refer to these lectures themselves which 
deal only with one aspect of science, the theoretical one. There 
is a powerful school of eminent scientists who consider such 
things to be a futile and snobbish sport, and the people who 
spend their time on it drones. Science has undoubtedly two 
aspects: it can be regarded from the social standpoint as a prac- 
tical collective endeavour for the improvement of human 
conditions, but it can also be regarded from the individualistic 
standpoint, as a pursuit of mental desires, the hunger for know- 
ledge and understanding, a sister of art, philosophy, and religion. 
Both aspects are justified, necessary, and complementary. The 
collective enterprise of practical science consists in the end of 
individuals and cannot thrive without their devotion. But 
devotion does not suffice; nothing great can be achieved without 
the elementary curiosity of the philosopher. A proper balance is 
needed. I have chosen the way which seemed to me to harmonize 
best with the spirit of this ancient place of learning. 


1. (II. p. 8.) Multiple causes 

Any event may have several causes. This possibility is not 
excluded by my definition (given explicitly on p. 9), though I 
speak there of A being 'the' cause of the effect B. Actually the 
'number' of causes, i.e. of conditions on which an effect B 
depends, seems to me a rather meaningless notion. One often 
finds the idea of a 'causal chain' $A_1, A_2, \ldots$, where B depends 
directly on $A_1$, $A_1$ on $A_2$, etc., so that B depends indirectly on 
any of the $A_n$. As the series may never end (where is a 'first 
cause' to be found?), the number of causes may be, and will be 
in general, infinite. But there seems to be not the slightest 
reason to assume only one such chain, or even a number of 
chains; for the causes may be interlocked in a complicated way, 
and a 'network' of causes (even in a multi-dimensional space) 
seems to be a more appropriate picture. Yet why should it be 
enumerable at all? The 'set of all causes' of an event seems to 
me a notion just as dangerous as the notions which lead to logical 
paradoxes of the type discovered by Russell. It is a metaphysical 
idea which has produced much futile controversy. Therefore I 
have tried to formulate my definition in such a way that this 
question can be completely avoided. 

2. (III. p. 13.) Derivation of Newton's law from Kepler's 

The fact that Newton's law is a logical consequence of Kepler's 
laws is the basis on which my whole conception of causality in 
physics rests. For it is, apart from Galileo's simple demonstration, 
the first and foremost example of a timeless cause-effect 
relation derived from observations. In most text-books of 
mechanics the opposite way (deduction of Kepler's laws from 
Newton's) is followed. Therefore it may be useful to give the 
full proof in modern terms. 

We begin with formulating Kepler's laws, splitting the first 
one in two parts: 

Ia. The orbit of a planet is a plane curve. 

Ib. It has the shape of an ellipse, one focus of which is the sun. 

II. The area A swept by the radius vector increases proportionally to time. 

III. The ratio of the cube of the semi-axis a of the ellipse to 
the square of the period T is the same for all planets. 

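From Ia it follows that it suffices to consider a plane, introducing 
rectangular coordinates $x, y$ and polar coordinates $r, \phi$, so that 

$$x = r\cos\phi, \qquad y = r\sin\phi.$$

Indicating differentiation with respect to time by a dot, one 
obtains for the velocity 

$$\dot x = \dot r\cos\phi - r\dot\phi\sin\phi, \qquad \dot y = \dot r\sin\phi + r\dot\phi\cos\phi,$$

and for the acceleration 

$$\ddot x = a_r\cos\phi - a_\phi\sin\phi, \qquad \ddot y = a_r\sin\phi + a_\phi\cos\phi,$$

where 

$$a_r = \ddot r - r\dot\phi^2, \tag{1}$$

$$a_\phi = 2\dot r\dot\phi + r\ddot\phi \tag{2}$$

are the radial and tangential components of the acceleration. 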

Next we use II. The element of the area in polar coordinates 
is obviously 

$$dA = \tfrac12 r^2\,d\phi.$$

FIG. 3. 

If the origin is taken at the centre of the sun, the rate of increase 
of A is constant, say $\tfrac12 h$, so that $dA = \tfrac12 h\,dt$, or 

$$2\dot A = r^2\dot\phi = h. \tag{3}$$

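Now it is convenient to use the variable 

$$u = \frac{1}{r} \tag{4}$$

instead of r and to describe the orbit by expressing u as a function 
of $\phi$, $u(\phi)$. Then 

$$\dot\phi = \frac{h}{r^2} = hu^2, \qquad \dot r = -\frac{1}{u^2}\frac{du}{d\phi}\,\dot\phi = -h\frac{du}{d\phi}. \tag{5}$$

Further, 

$$\ddot\phi = \frac{d\dot\phi}{d\phi}\,\dot\phi = 2hu\frac{du}{d\phi}\cdot hu^2 = 2h^2u^3\frac{du}{d\phi}. \tag{6}$$

Substituting (4), (5), (6) into (2), one finds 

$$a_\phi = 0. \tag{7}$$

Hence the acceleration has only a radial component $a_r$ with 
respect to the sun. To obtain the value of $a_r$ we calculate, with 
the help of (4), 

$$\dot r = \frac{dr}{d\phi}\,\dot\phi = \frac{h}{r^2}\frac{dr}{d\phi} = -h\frac{du}{d\phi}, \qquad \ddot r = -h\frac{d^2u}{d\phi^2}\,\dot\phi = -h^2u^2\frac{d^2u}{d\phi^2},$$

and substitute this in (1): 

$$a_r = -h^2u^2\left(\frac{d^2u}{d\phi^2} + u\right). \tag{8}$$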

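Now we use Ib. The polar equation of an ellipse is 

$$r = \frac{q}{1+\epsilon\cos\phi}, \tag{9}$$

where q is the semi-latus rectum and $\epsilon$ the numerical eccentricity; 
or 

$$u = \frac{1}{q}(1+\epsilon\cos\phi).$$

From this, one obtains 

$$\frac{du}{d\phi} = -\frac{\epsilon}{q}\sin\phi, \qquad \frac{d^2u}{d\phi^2} = -\frac{\epsilon}{q}\cos\phi,$$

hence from (8) 

$$a_r = -\frac{h^2}{q}u^2 = -\frac{h^2}{q}\,\frac{1}{r^2}. \tag{10}$$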

The acceleration is directed to the sun (centripetal) and is 
inversely proportional to the square of the distance. 
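The step from the ellipse to the inverse-square law can be checked numerically. The following is only a sketch: the values of q, the eccentricity and h are arbitrary illustrative numbers, the function name is hypothetical, and the second derivative of u is taken by central differences rather than analytically.

```python
import math

def radial_acceleration(phi, q, eps, h, d=1e-4):
    """Evaluate a_r = -h^2 u^2 (u'' + u) for the ellipse u = (1 + eps cos phi)/q.

    u'' is approximated by a central difference with step d (an assumption
    of this sketch, not part of the derivation)."""
    def u(angle):
        return (1.0 + eps * math.cos(angle)) / q

    u0 = u(phi)
    u_second = (u(phi + d) - 2.0 * u0 + u(phi - d)) / d**2
    return -h**2 * u0**2 * (u_second + u0)

# Arbitrary illustrative values (not taken from the text).
q, eps, h = 1.3, 0.4, 2.0

for phi in (0.0, 0.7, 2.0, 3.1):
    r = q / (1.0 + eps * math.cos(phi))
    inverse_square = -h**2 / (q * r**2)
    print(phi, radial_acceleration(phi, q, eps, h), inverse_square)
```

The two printed columns should agree to within the accuracy of the finite difference at every angle, which is the content of the statement that the acceleration varies as the inverse square of the distance.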


According to the third law III one can write 

$$\frac{a^3}{T^2} = \frac{\mu}{4\pi^2}, \tag{11}$$

where the constant $\mu$ is the same for all planets. 
Now integrating (3) for a full revolution one has 

2A = hT. (12) 

On the other hand, the area of an ellipse is given by 

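$$A = \pi ab, \tag{13}$$

where a and b are the major and minor semi-axes. 

Taking in (9) $\phi = 0$ and $\phi = \pi$ one gets the perihelion and 
aphelion distances; half of the sum of those is the semi-major axis 

$$a = \frac12\left(\frac{q}{1+\epsilon} + \frac{q}{1-\epsilon}\right) = \frac{q}{1-\epsilon^2},$$

while the semi-minor axis is given by 

$$b = a\sqrt{1-\epsilon^2};$$

hence $aq = b^2$. 

Substituting this in (13), one gets from (12) 

$$hT = 2\pi ab = 2\pi a^{3/2}q^{1/2};$$

solving with respect to $h^2/q$ and using (11): 

$$\frac{h^2}{q} = \frac{4\pi^2a^3}{T^2} = \mu. \tag{14}$$

Therefore the law of acceleration (10) becomes 

$$a_r = -\frac{\mu}{r^2}, \tag{15}$$

where $\mu$ is the same for all planets, hence a property of the sun, 
called the gravitational mass. 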

This demonstrates the statement of the text that Newton's 
derivation of his law of force is purely deductive, based on the 
inductive work of Tycho Brahe and Kepler. The new feature due 
to Newton is the theoretical interpretation of the deduced formula 
for the acceleration, as representing the 'cause' of the motion, 


or the force determining the motion, which then led him to the 
fundamental idea of general gravitation (each body attracts each 
other one). In the text-books this situation is not always clear; 
this may be due to Newton's own representation in his Principia 
where he uses only geometrical constructions in the classical 
style of the Greeks. Yet it is known that he possessed the 
methods of infinitesimal calculus (theory of fluxions) for many 
years. I do not know whether he actually discovered his results 
with the help of the calculus; it seems to me incredible that he 
should not. He was obviously keen to avoid new mathematical 
methods in order to comply with the taste of his contemporaries. 
But it is known also that he liked to conceal his real ideas by 
dressing them up. This tendency is found in Gauss and other 
great mathematicians as well and has survived to our time, 
much to the disadvantage of science. 

Newton regarded the calculation of terrestrial gravity from 
astronomical data as the crucial test of his theory, and he with- 
held publication for years as the available data about the radius 
of the earth were not satisfactory. The formula (3.3) of the text 
is simply obtained by regarding the earth as central body and 
the moon as 'planet'. Then $\mu$ is the gravitational mass of the 
earth which can be obtained from (11) by inserting for a the 
mean distance R of the centre of the moon from that of the earth, 
and for T the length of the month. Substituting $\mu = 4\pi^2R^3/T^2$ 
into (15), where r is the radius of the earth, one obtains for the 
acceleration on the earth's surface $g$ ($= a_r$) the formula (3.3) 
of the text, 

$$g = \frac{4\pi^2R^3}{T^2r^2}.$$

If here the values $R/r = 60$, $R = 3.84\times10^{10}$ cm., and 
$T = 27^{d}\,7^{h}\,43^{m}\,11.5^{s} = 2.361\times10^{6}$ sec. are substituted, one 
finds $g = 980.2$ cm. sec.$^{-2}$, while the observed value (extrapolated 
to the pole) is $g = 980.6$ cm. sec.$^{-2}$. 
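The arithmetic of this test is easy to repeat. The sketch below simply evaluates $g = 4\pi^2R^3/(T^2r^2)$ with the rounded inputs quoted above; the function name is hypothetical, and because the inputs are rounded the result should be expected to land close to, not exactly on, the quoted value.

```python
import math

def surface_gravity(R_cm, T_sec, R_over_r):
    """g = mu / r^2 with mu = 4 pi^2 R^3 / T^2, written as (4 pi^2 R / T^2) (R/r)^2."""
    return 4.0 * math.pi**2 * R_cm / T_sec**2 * R_over_r**2

# Length of the month: 27 d 7 h 43 m 11.5 s, converted to seconds.
T = ((27 * 24 + 7) * 60 + 43) * 60 + 11.5

# Rounded inputs as quoted in the text: R = 3.84e10 cm, R/r = 60.
g = surface_gravity(3.84e10, T, 60.0)
print(T, g)  # g in cm/sec^2, to be compared with the observed value of about 980
```

With these rounded figures the computed value falls within a fraction of a per cent of the observed one; the last digits are sensitive to the precise value adopted for R.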

This reasoning is based on the plausible assumption that the 
acceleration produced by a material sphere at a point outside is 
independent of the radial distribution of density and the mass of 
the sphere can therefore be regarded as concentrated in the 
centre. The rigorous proof of this lemma forms an important 
part of Newton's considerations and was presumably achieved 
with the help of his theory of fluxions. 


3. (IV. p. 20.) Cauchy's mechanics of continuous media 

The mathematical tool for handling continuous substances is 
the following theorem of Gauss (also attributed to Green). 

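If a vector field $\mathbf A$ is defined inside and on the surface S of a 
volume V, one has 

$$\int_V \operatorname{div}\mathbf A\,dV = \int_S \mathbf A\cdot\mathbf n\,dS, \tag{1}$$

where $\mathbf n$ is the unit vector in the direction of the outer normal of 
the surface element dS and 

$$\operatorname{div}\mathbf A = \frac{\partial A_x}{\partial x} + \frac{\partial A_y}{\partial y} + \frac{\partial A_z}{\partial z} = \frac{\partial}{\partial\mathbf x}\cdot\mathbf A. \tag{2}$$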

If $\rho$ is the density, the total mass inside V is 

$$m = \int_V \rho\,dV. \tag{3}$$

The amount of mass leaving the volume through the surface per unit time is 

$$\int_S \mathbf u\cdot\mathbf n\,dS,$$

where $\mathbf u = \rho\mathbf v$ is the current, $\mathbf v$ the velocity. 
The indestructibility of mass is then expressed by 

$$\dot m + \int_S \mathbf u\cdot\mathbf n\,dS = 0.$$

Substituting (3) and applying (1), one obtains a volume 
integral, which vanishes for any surface; hence its integrand 
must be zero: 

$$\dot\rho + \operatorname{div}\mathbf u = 0. \tag{4}$$

This is the continuity equation (4.5) of the text. 
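Gauss's theorem (1), on which this step rests, can be illustrated numerically on a unit cube. The sketch below is an assumption-laden toy: the vector field, the cube and the midpoint quadrature are chosen here purely for illustration and do not come from the text.

```python
def midpoints(n):
    """Midpoints of n equal sub-intervals of [0, 1]."""
    return [(i + 0.5) / n for i in range(n)]

def A(x, y, z):
    """Illustrative vector field (hypothetical choice)."""
    return (x * x, y * z, x * z)

def div_A(x, y, z):
    """Analytic divergence of A: d(x^2)/dx + d(yz)/dy + d(xz)/dz."""
    return 2.0 * x + z + x

def volume_integral(n=20):
    """Midpoint-rule integral of div A over the unit cube."""
    pts = midpoints(n)
    return sum(div_A(x, y, z) for x in pts for y in pts for z in pts) / n**3

def surface_flux(n=20):
    """Midpoint-rule flux of A through the six faces, outer normal."""
    pts = midpoints(n)
    total = 0.0
    for s in pts:
        for t in pts:
            total += A(1.0, s, t)[0] - A(0.0, s, t)[0]  # faces x = 1 and x = 0
            total += A(s, 1.0, t)[1] - A(s, 0.0, t)[1]  # faces y = 1 and y = 0
            total += A(s, t, 1.0)[2] - A(s, t, 0.0)[2]  # faces z = 1 and z = 0
    return total / n**2

print(volume_integral(), surface_flux())
```

Both numbers agree, as the theorem requires; replacing A by the mass current u turns the same identity into the statement that the mass lost from the volume equals the mass carried through its surface.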

Consider now the forces acting on the volume V. Neglecting 
those forces which act on each volume element (like Newton's 
gravitation), we assume with Cauchy that there are surface 
forces or tensions, acting on each element dS of the surface S, 
and proportional to dS. They will also depend on the orientation 
of dS, i.e. on the normal vector $\mathbf n$, and can therefore be written 
$\mathbf T_n\,dS$. If $\mathbf n$ coincides with one of the three axes of coordinates 
x, y, z, the corresponding forces per unit area may be represented 
by the vectors $\mathbf T_x, \mathbf T_y, \mathbf T_z$. Now the projections of an 
element dS on the coordinate planes are 

$$dS_x = n_x\,dS, \qquad dS_y = n_y\,dS, \qquad dS_z = n_z\,dS.$$

The equilibrium of the tetrahedron with the sides $dS, dS_x, dS_y, dS_z$ 
then leads to the equation 

$$\mathbf T_n\,dS = \mathbf T_x\,dS_x + \mathbf T_y\,dS_y + \mathbf T_z\,dS_z,$$

or 

$$\mathbf T_n = \mathbf T_x n_x + \mathbf T_y n_y + \mathbf T_z n_z, \tag{5}$$

which is the formula (4.6) of the text. 

FIG. 4. 

Consider further the equilibrium of a rectangular volume 
element, and in particular its cross-section $z = 0$, with the sides 
dx, dy. The components of $\mathbf T_x$ in this plane may be denoted by 
$T_{xx}$ and $T_{xy}$, those of $\mathbf T_y$ by $T_{yx}$, $T_{yy}$. Then the tangential 
components on the surfaces $dy\,dz$ and $dx\,dz$ produce a couple about the 
origin with the moment 

$$(T_{xy}\,dy\,dz)\,dx - (T_{yx}\,dx\,dz)\,dy.$$

FIG. 5. 

This must vanish in equilibrium; therefore one has 

$$T_{xy} = T_{yx}$$

and the corresponding equations obtained by cyclic permutation 
of the indices, (4.7) of the text. Hence the stress tensor $\mathsf T$ defined 
by (4.8) is symmetrical. One can express this, with the help of 
(5), in the form 

$$(\mathbf T_n)_x = T_{xx}n_x + T_{yx}n_y + T_{zx}n_z = \mathbf T_x\cdot\mathbf n,$$

where $\mathbf T_x$ is the vector $(T_{xx}, T_{xy}, T_{xz})$. 

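The x-component of the total force $\mathbf F = \int \mathbf T_n\,dS$ can now be 
transformed with the help of formula (1) into a volume integral 

$$F_x = \int_S (\mathbf T_n)_x\,dS = \int_S \mathbf T_x\cdot\mathbf n\,dS = \int_V \operatorname{div}\mathbf T_x\,dV.$$

Using the tensor notation of the text, (4.10), one can write this 

$$\mathbf F = \int_V \operatorname{div}\mathsf T\,dV. \tag{6}$$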

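This has to be equated to the rate of change of momentum of a 
given amount of matter, i.e. enclosed in a volume moving in 
time. One has for any function $\Phi$ of space 

$$\frac{d}{dt}\int_V \Phi\,dV = \lim_{\Delta t\to0}\frac{1}{\Delta t}\left\{\int_{V(t+\Delta t)}\Phi(t+\Delta t)\,dV - \int_{V(t)}\Phi(t)\,dV\right\}$$

$$= \lim_{\Delta t\to0}\frac{1}{\Delta t}\left\{\int_{V}\frac{\partial\Phi}{\partial t}\,\Delta t\,dV + \int_{\Delta V}\Phi\,dV\right\}.$$

The second integral is extended over the volume between two 
infinitesimally near positions of the surface, so that 

$$dV = \mathbf n\cdot\mathbf v\,\Delta t\,dS$$

and therefore 

$$\int_S \Phi\,\mathbf n\cdot\mathbf v\,dS\,\Delta t = \int_V \operatorname{div}(\Phi\mathbf v)\,dV\,\Delta t.$$

Hence 

$$\frac{d}{dt}\int_V \Phi\,dV = \int_V \left\{\frac{\partial\Phi}{\partial t} + \operatorname{div}(\Phi\mathbf v)\right\}dV. \tag{7}$$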


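If this is applied to the components of the momentum density 
$\rho\mathbf v$ one obtains for the rate of change of the total momentum $\mathbf P$: 

$$\frac{d\mathbf P}{dt} = \int_V \rho\,\frac{d\mathbf v}{dt}\,dV + \int_V \mathbf v\,(\dot\rho + \operatorname{div}\mathbf u)\,dV. \tag{8}$$

Here the second integral vanishes in consequence of the continuity 
equation (4), with $\mathbf u = \rho\mathbf v$. In the first integral appears the 
convective derivative, defined by (4.11) of the text, 

$$\frac{d}{dt} = \frac{\partial}{\partial t} + \mathbf v\cdot\frac{\partial}{\partial\mathbf x}.$$

Now the equation of motion 

$$\frac{d\mathbf P}{dt} = \mathbf F$$

reduces in virtue of (6) and (8) to (4.9) of the text: 

$$\rho\,\frac{d\mathbf v}{dt} = \operatorname{div}\mathsf T. \tag{9}$$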

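Consider in particular an elastic fluid where 

$$T_{xx} = T_{yy} = T_{zz} = -p, \qquad T_{yz} = T_{zx} = T_{xy} = 0,$$

and the pressure p is a function of $\rho$ alone. Then the continuity 
equation and the equations of motion 

$$\dot\rho + \operatorname{div}(\rho\mathbf v) = 0, \qquad \rho\,\frac{d\mathbf v}{dt} = -\operatorname{grad}p,$$

are four differential equations for the four functions $\rho, v_x, v_y, v_z$. 
If one wishes to determine small deviations from equilibrium, 
then $\mathbf v$ and $\phi = \rho - \rho_0$ are small and $\rho_0$ constant with regard to 
space and time. Then the last two equations reduce in first 
approximation to 

$$\dot\phi + \rho_0\operatorname{div}\mathbf v = 0, \qquad \rho_0\dot{\mathbf v} = -\left(\frac{dp}{d\rho}\right)_0\operatorname{grad}\phi.$$

By differentiating the first of these with respect to time and 
substituting $\rho_0\dot{\mathbf v}$ from the second, one finds 

$$\ddot\phi = \left(\frac{dp}{d\rho}\right)_0\operatorname{div}\operatorname{grad}\phi,$$

or with $\operatorname{div}\operatorname{grad} = \dfrac{\partial}{\partial\mathbf x}\cdot\dfrac{\partial}{\partial\mathbf x} = \Delta$ 

$$\frac{1}{c^2}\,\ddot\phi = \Delta\phi, \qquad c^2 = \left(\frac{dp}{d\rho}\right)_0. \tag{10}$$

This is the equation (4.13) of the text applied to the variation of 
density $\phi$. Each of the velocity components satisfies the same 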
equation, which is the prototype of all laws of wave propagation. 
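The characteristic property of this equation, that any disturbance of fixed shape travelling with velocity c is a solution, can be checked with a few lines of code. This is only a sketch: the Gaussian pulse, the value of c and the finite-difference step are arbitrary choices made here for illustration.

```python
import math

def wave_residual(f, x, t, c, d=1e-3):
    """Return f_tt - c^2 f_xx at (x, t), using central differences with step d."""
    f_tt = (f(x, t + d) - 2.0 * f(x, t) + f(x, t - d)) / d**2
    f_xx = (f(x + d, t) - 2.0 * f(x, t) + f(x - d, t)) / d**2
    return f_tt - c**2 * f_xx

c = 2.0  # propagation velocity in arbitrary units (illustrative assumption)

def travelling_pulse(x, t):
    """A pulse of fixed shape moving with velocity c."""
    return math.exp(-(x - c * t) ** 2)

def slow_pulse(x, t):
    """The same shape moving with velocity c/2; not a solution for velocity c."""
    return math.exp(-(x - 0.5 * c * t) ** 2)

print(wave_residual(travelling_pulse, 0.3, 0.1, c))  # close to zero
print(wave_residual(slow_pulse, 0.3, 0.1, c))        # clearly different from zero
```

The first residual vanishes up to discretization error, the second does not, which shows that the equation singles out the velocity c.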

4. (IV. p. 24.) Maxwell's equations of the electromagnetic field 

The mathematical part of Maxwell's work consisted in con- 
densing the experimental laws, mentioned in the text, in a set 
of differential equations which, with the usual notation, are 

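$$\operatorname{div}\mathbf D = 4\pi\rho, \qquad \operatorname{curl}\mathbf H = \frac{4\pi}{c}\,\mathbf u,$$

$$\operatorname{div}\mathbf B = 0, \qquad \operatorname{curl}\mathbf E + \frac{1}{c}\,\dot{\mathbf B} = 0, \tag{1}$$

$$\mathbf D = \epsilon\mathbf E, \qquad \mathbf B = \mu\mathbf H.$$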

To give a simple example, Coulomb's law for the electrostatic 
field is obtained by putting $\mathbf B = 0$, $\mathbf H = 0$, $\mathbf u = 0$; then there 
remains 

$$\operatorname{div}\mathbf D = 4\pi\rho, \qquad \operatorname{curl}\mathbf E = 0.$$

The second equation implies that there is a potential $\phi$, such that 

$$\mathbf E = -\operatorname{grad}\phi.$$

In vacuo, where $\mathbf D = \mathbf E$, one obtains therefore Poisson's 
equation 

$$\operatorname{div}\mathbf E = -\Delta\phi = 4\pi\rho.$$

The solution is 

$$\phi = \int \frac{\rho}{r}\,dV, \tag{2}$$

provided singularities are excluded; this formula expresses 
Coulomb's law for a continuous distribution of density. In a 
similar way one obtains for stationary states ($\dot{\mathbf B} = 0$) the law of 
Biot and Savart for the magnetic field of a current of density $\mathbf u$. 


Maxwell's physical idea consisted in discovering the asymmetry 
in the equations (1) which, in our style of writing, is 
obvious even to the untrained eye: the missing term $-\frac{1}{c}\dot{\mathbf D}$ in the 
second equation. The logical necessity of this term follows from 
the fact of the existence of open currents, e.g. discharges of condensers 
through wires. In this case the charge on the condenser 
changes in time, hence $\dot\rho \neq 0$; on the other hand, the equations 
(1) imply 

$$\operatorname{div}\mathbf u = \frac{c}{4\pi}\operatorname{div}\operatorname{curl}\mathbf H = 0.$$

Therefore the continuity equation $\dot\rho + \operatorname{div}\mathbf u = 0$ is violated. 

To amend this Maxwell postulated a new type of current 
bridging the gap between the conductors in the condenser, with 
a certain density w, so that 

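$$\operatorname{curl}\mathbf H = \frac{4\pi}{c}(\mathbf u + \mathbf w). \tag{3}$$

Then taking the div operation one has 

$$\operatorname{div}\mathbf w = -\operatorname{div}\mathbf u = \dot\rho = \frac{1}{4\pi}\operatorname{div}\dot{\mathbf D}.$$

The simplest way of satisfying this equation is putting 

$$\mathbf w = \frac{1}{4\pi}\,\dot{\mathbf D}, \tag{4}$$

so that the corresponding equation in Maxwell's set becomes 

$$\operatorname{curl}\mathbf H - \frac{1}{c}\,\dot{\mathbf D} = \frac{4\pi}{c}\,\mathbf u \tag{5}$$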

and complete symmetry between electric and magnetic quanti- 
ties is obtained (apart from the fact that the latter have no true 
charge and current). 
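As a short worked check (added here for illustration and using only the equations of this section), one can take the divergence of the amended equation $\operatorname{curl}\mathbf H - \frac{1}{c}\dot{\mathbf D} = \frac{4\pi}{c}\mathbf u$ and insert $\operatorname{div}\mathbf D = 4\pi\rho$; the continuity equation is then satisfied identically instead of being violated:

```latex
\begin{aligned}
0 = \operatorname{div}\operatorname{curl}\mathbf H
  &= \frac{1}{c}\operatorname{div}\dot{\mathbf D} + \frac{4\pi}{c}\operatorname{div}\mathbf u \\
  &= \frac{4\pi}{c}\left(\dot\rho + \operatorname{div}\mathbf u\right).
\end{aligned}
```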

The modified system of field equations permits the prediction 
of waves with finite velocity. In an isotropic substance free of 
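charges and currents ($\rho = 0$, $\mathbf u = 0$, $\mathbf D = \epsilon\mathbf E$, $\mathbf B = \mu\mathbf H$) one has 

$$\operatorname{curl}\mathbf H - \frac{\epsilon}{c}\,\dot{\mathbf E} = 0, \qquad \operatorname{curl}\mathbf E + \frac{\mu}{c}\,\dot{\mathbf H} = 0;$$

taking the curl of one of them, and using the formula that for a 
vector with vanishing div one has $\operatorname{curl}\operatorname{curl} = -\Delta$, one obtains 
for each component $f$ of $\mathbf E$ and $\mathbf H$ the wave equation 

$$\frac{\epsilon\mu}{c^2}\,\ddot f = \Delta f.$$

For vacuum ($\epsilon = \mu = 1$) the velocity of propagation should 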
therefore be equal to the electromagnetic constant c. As stated 
in the text, this constant has the dimensions of a velocity and can 
be measured by determining the magnetic field of a current 
produced by a condenser discharge (measured therefore electro- 
statically). Such experiments had been performed by Kohl- 
rausch and Weber, and their result for c agreed with the velocity 
of light in vacuo. This evidence for the electromagnetic theory of 
light was strongly enhanced by experiments carried out by 
Boltzmann, which showed that the velocity of light in simple 
substances (rare gases, which are monatomic) can be calculated 
from their dielectric constant $\epsilon$ ($\mu$ being practically $= 1$) with the 
help of Maxwell's formula $c_1 = c/\sqrt{\epsilon}$. 

Maxwell's formulation satisfies contiguity, but its relation to 
Cauchy's form of the dynamical laws has still to be established. 
The electric and magnetic field vectors, though originally defined 
by the forces on point charges and magnetic poles (which actually 
do not exist), are defined by the equations also in places where 
neither charges nor currents exist. Yet they are not stresses 
themselves; they are analogous to strains, on which the stresses 
depend. The law of this connexion has also been found by Max- 
well; it is a mathematical formulation of Faraday's intuitive 
interpretation of the mechanical reactions between electrified 
and magnetized bodies. A short indication must here suffice. 

Apart from the electric force on a point charge e, F = eE, 
there exists a mechanical force on the element of a linear current 
u, produced by a magnetic field H; this force is perpendicular 
to H and to the current u and therefore does no work. It is not 
quite uniquely determined, as one can obviously add any force 
whose line integral over a closed circuit vanishes. The simplest 
expression is: 

$$\mathbf F = \frac{1}{c}\,\mathbf u\wedge\mathbf B,$$

as can be seen by considering the change of magnetic energy 
$\frac{1}{8\pi}\int \mathbf H\cdot\mathbf B\,dV$ produced by a virtual displacement of an element 
of the current. 


To illustrate Maxwell's procedure it suffices to consider charge 
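distributions in vacuo with density $\rho$ and current $\mathbf u = \rho\mathbf v$. 
Combining the two forces $\mathbf F = e\mathbf E$ and $\mathbf F = \frac{1}{c}\,\mathbf u\wedge\mathbf B$ into one 
expression, one has for the density of force 

$$\mathbf f = \rho\left(\mathbf E + \frac{1}{c}\,\mathbf v\wedge\mathbf B\right),$$

the so-called Lorentz force. 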

Substituting here for $\rho$ and $\rho\mathbf v$ the expressions from Maxwell's 
equations one can, by elementary transformations, bring $\mathbf f$ into 
the form of Cauchy [formulae illegible in the source]. 


These are the celebrated formulae of Maxwell's tensions. They 
can be easily generalized for material bodies with dielectric 
constant and permeability, and they have become the prototype 
for similar expressions in other field theories, e.g. gravitation 
(Einstein), electronic field (Dirac), meson field (Yukawa). 
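
For orientation, the standard textbook form of this result for fields in vacuo (Gaussian units, $\mathbf H = \mathbf B$) is sketched below. The notation, and the explicit time-derivative term, are assumptions of this sketch and may differ from the formulae as originally printed.

```latex
\mathbf f + \frac{1}{4\pi c}\,\frac{\partial}{\partial t}\,(\mathbf E\wedge\mathbf H)
  = \operatorname{div}\mathsf T,
\qquad
T_{ik} = \frac{1}{4\pi}\Bigl[E_iE_k + H_iH_k - \tfrac12\,\delta_{ik}\,(E^2 + H^2)\Bigr].
```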

5. (IV. p. 27.) Relativity 

It is impossible to give a short sketch of the theory of relativity, 
and the reader is referred to the text-books. The best representa- 
tion seems to me still the article in vol. v of the Mathematical 
Encyclopaedia written by W. Pauli when he was a student, 
about twenty years of age. There one finds a clear statement of 
the experimental facts which led to the mathematical theory 
almost unambiguously. Eddington's treatment gives the impression 
that the results could have been obtained, or even 
have been obtained, by pure reason, using epistemological 
principles. I need not say that this is wrong and misleading. 
There was, of course, a philosophical urge behind Einstein's 
relentless effort; in particular the violation of contiguity in 
Newton's theory seemed to him unacceptable. Yet the greatness 
of his achievement was just that he based his own theory not on 
preconceived notions but on hard facts, facts which were obvious 


to everybody, but noticed by nobody. The main fact was the 
identity of inertial and gravitational mass, which he expressed 
as the principle of equivalence between acceleration and gravita- 
tion. An observer in a closed box cannot decide by any experi- 
ment whether an observed acceleration of a body in the box is 
due to gravity produced by external bodies or to an acceleration 
of the box in the opposite direction. This principle means that 
arbitrary, non-linear transformations of time must be admitted. 
But the formal symmetry between space-coordinates and time 
discovered by Minkowski made it very improbable that the 
transformations of space should be linear, and this was corro- 
borated by considering rotating bodies: a volume element on the 
periphery should undergo a peripheral contraction according 
to the results of special relativity, but remain unchanged in the 
radial direction. Hence acceleration was necessarily connected 
with deformation. This led to the postulate that all laws of 
nature ought to be unchanged (covariant) with respect to 
arbitrary space-time transformations. But as special relativity 
must be preserved in small domains, the postulate of invariance 
of the line element had to be made. 

The long struggle of Einstein to find the general covariant field 
equations was due to the difficulty for a physicist to assimilate 
the mathematical ideas necessary, ideas which were in fact 
completely worked out by Riemann and his successors, Levi- 
Civita, Ricci, and others. 

I wish to add here only one remark. The physical significance 
of the line element seems to me rather mystical in a genuinely 
continuous space-time. If it is replaced by the assumption of 
parallel displacement (affine connexion), this impression of 
mystery is still further enhanced. On the other hand, the 
appearance of a finite length in the ultimate equations of physics 
can be expected. Quantum theory is the first step in this direction; 
it introduces not a universal length but a constant, Planck's 
$\hbar$, of the dimension length times momentum, into the laws of 
physics. There are numerous indications that the further 
development of physics will lead to a separate appearance of 
these two factors, $\hbar = q\cdot p$, in the ultimate laws. The difficulties 
of present-day physics are centred about the problem of intro- 
ducing this length q in a way which satisfies the principle of 
relativity. This fact seems to indicate that relativity itself 


needs a generalization where the infinitesimal element ds is 
replaced by a finite length. 

The papers quoted in the text are: A. Einstein, L. Infeld, and 
B. Hoffmann, Ann. of Math., 39, no. 1, p. 65 (Princeton, 
1938); V. A. Fock, Journ. of Phys. U.S.S.R. 1, no. 2, p. 81 (1939). 

6. (V. p. 38.) On classical and modern thermodynamics 

It is often said that the classical derivation of the second law 
of thermodynamics is much simpler than Carathéodory's as it 
needs less abstract conceptions than Pfaffian equations. But 
this objection is quite wrong. For what one has to show is the 
existence of an integrating denominator of dQ. This is trivial 
for a Pfaffian of two variables (representing, for example, a single 
fluid with $V, \vartheta$); it must be shown not to be trivial and even, in 
general, wrong for Pfaffians with more than two variables (e.g. two 
fluids in thermal contact with $V_1, V_2, \vartheta$). Otherwise, the student 
cannot possibly understand what the fuss is all about. But that 
means explaining to him the difference between the two classes 
of Pfaffians of three variables, the integrable ones and the non- 
integrable ones. Without that all talk about Carnot cycles is just 
empty verbiage. But as soon as one has this difference, why not 
then use the simple criterion of accessibility from neighbouring 
points, instead of invoking quite new ideas borrowed from 
engineering ? I think a satisfactory lecture or text-book should 
bring this classical reasoning as a corollary of historical interest, 
as I have suggested long ago in a series of papers (Phys. Zeitschr. 
22, pp. 218, 249, 282 (1921)). 

Since writing the text I have come across one book which 
gives a short account of Carathéodory's theory, H. Margenau 
and G. M. Murphy, The Mathematics of Physics and Chemistry 
(D. van Nostrand Co., New York, 1943), § 1.15, p. 26. But 
though the mathematics is correct, it does not do justice to the 
idea. For it says on p. 28: 'This formal mathematical consequence 
of the properties of the Pfaff equation [namely the 
theorem proved in the next section of the appendix] is known as 
the principle of Carathéodory. It is exactly what we need for 
thermodynamics.' Carathéodory's principle is, of course, not 
that formal mathematical theorem but the induction from obser- 
vation that there are inaccessible states in any neighbourhood 
of a given state. 


7. (V. p. 39.) Theorem of accessibility 

An example of a Pfaffian which has no integrating denomina- 
tor (by the way, the same example as described in geometrical 
terms in the text) is this: 

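$$dQ = -y\,dx + x\,dy + k\,dz,$$

where k is a constant. If it were possible to write dQ in the 
form $\lambda\,d\phi$, where $\lambda$ and $\phi$ are functions of x, y, z, one would have 

$$\frac{\partial\phi}{\partial x} = -\frac{y}{\lambda}, \qquad \frac{\partial\phi}{\partial y} = \frac{x}{\lambda}, \qquad \frac{\partial\phi}{\partial z} = \frac{k}{\lambda},$$

hence 

$$\frac{\partial^2\phi}{\partial y\,\partial z} = \frac{\partial}{\partial z}\left(\frac{x}{\lambda}\right) = \frac{\partial}{\partial y}\left(\frac{k}{\lambda}\right), \qquad
\frac{\partial^2\phi}{\partial z\,\partial x} = \frac{\partial}{\partial z}\left(-\frac{y}{\lambda}\right) = \frac{\partial}{\partial x}\left(\frac{k}{\lambda}\right), \qquad
\frac{\partial^2\phi}{\partial x\,\partial y} = \frac{\partial}{\partial y}\left(-\frac{y}{\lambda}\right) = \frac{\partial}{\partial x}\left(\frac{x}{\lambda}\right),$$

or 

$$x\frac{\partial\lambda}{\partial z} - k\frac{\partial\lambda}{\partial y} = 0, \qquad y\frac{\partial\lambda}{\partial z} + k\frac{\partial\lambda}{\partial x} = 0, \qquad x\frac{\partial\lambda}{\partial x} + y\frac{\partial\lambda}{\partial y} = 2\lambda.$$

By substituting $\partial\lambda/\partial x$ and $\partial\lambda/\partial y$ from the first two equations 
in the third one finds $\lambda = 0$. 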

Examples like this show clearly that the existence of an 
integrating denominator is an exception. 
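
The same conclusion can be reached with the standard integrability criterion for a Pfaffian of three variables, which is a different route from the elimination used above: an integrating denominator exists (locally) only if the coefficient vector $\mathbf X = (X, Y, Z)$ satisfies $\mathbf X\cdot\operatorname{curl}\mathbf X = 0$. The sketch below evaluates this quantity by central differences; the function names and the sample point are hypothetical choices made for illustration.

```python
def example_pfaffian(x, y, z, k=1.0):
    """Coefficients (X, Y, Z) of dQ = -y dx + x dy + k dz."""
    return (-y, x, k)

def exact_pfaffian(x, y, z, k=1.0):
    """Coefficients of y dx + x dy + k dz = d(xy + kz), an exact differential."""
    return (y, x, k)

def curl(field, x, y, z, d=1e-5):
    """Curl of a three-component field at (x, y, z) by central differences."""
    def partial(component, axis):
        plus, minus = [x, y, z], [x, y, z]
        plus[axis] += d
        minus[axis] -= d
        return (field(*plus)[component] - field(*minus)[component]) / (2.0 * d)

    return (partial(2, 1) - partial(1, 2),
            partial(0, 2) - partial(2, 0),
            partial(1, 0) - partial(0, 1))

def integrability_defect(field, x, y, z):
    """X . curl X; zero is required for an integrating denominator to exist."""
    X = field(x, y, z)
    C = curl(field, x, y, z)
    return sum(a * b for a, b in zip(X, C))

print(integrability_defect(example_pfaffian, 0.3, -1.2, 0.5))  # about 2k = 2
print(integrability_defect(exact_pfaffian, 0.3, -1.2, 0.5))    # about 0
```

For the example the defect equals 2k everywhere, so it vanishes only in the trivial case k = 0, while the exact differential chosen for comparison gives zero.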

We now give the proof of the theorem of accessibility. 
Consider the solutions of the Pfaffian 

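$$dQ = X\,dx + Y\,dy + Z\,dz = 0, \tag{1}$$

which lie in a given surface S, 

$$x = x(u, v), \qquad y = y(u, v), \qquad z = z(u, v).$$

They satisfy a Pfaffian 

$$U\,du + V\,dv = 0, \tag{2}$$

where 

$$U = X\frac{\partial x}{\partial u} + Y\frac{\partial y}{\partial u} + Z\frac{\partial z}{\partial u}, \qquad
V = X\frac{\partial x}{\partial v} + Y\frac{\partial y}{\partial v} + Z\frac{\partial z}{\partial v}.$$

Hence through every point P of S there passes one curve, because 
(2) is equivalent to the ordinary differential equation 

$$\frac{dv}{du} = -\frac{U}{V},$$

which has a one-parameter set of solutions $\phi(u, v) = \text{const.}$, 
covering the surface S. 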

Let us now suppose that, in the neighbourhood of a point P, there 
are inaccessible points; let Q be one of these. Construct through 
P a straight line $\mathcal L$, which is not a solution of (1), and the plane $\pi$ 
through Q and $\mathcal L$. In $\pi$ there is just one curve satisfying (1) and 
going through Q; this curve will meet the line $\mathcal L$ at a point R. 
Then R must be inaccessible from P; for if there should exist a 
solution leading from P to R, then one could also reach Q from 
P by a continuous (though kinked) solution curve, which 
contradicts the assumption that Q is inaccessible from P. The 
point R can be made to lie as near to P as one wishes by choosing 
Q near enough to P. 

FIG. 6. 

FIG. 7. 

Now we move the straight line $\mathcal L$ parallel to itself in a cyclic 
way so that it describes a closed cylinder. Then there exists on 
this cylinder a solution curve $\mathcal C$ which starts from P on $\mathcal L$ and 
meets $\mathcal L$ again at a point N. It follows that N and P must coincide. 
For otherwise one could, by deforming the cylinder, make 
N sweep along the line $\mathcal L$ towards P and beyond P. Hence there 
would be an interval of accessible points (like N) around P, 
while it has been proved before that there are inaccessible points 
Q in any neighbourhood of P. 

As N now coincides with P the connecting curve $\mathcal C$ can be 
made, by steady deformation of the cylinder, to describe a 
surface which contains obviously all solutions starting from P. 
If this surface is given by <f>(x, y, z) = 0, one has 

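$$dQ = \lambda\,d\phi,$$

which is the theorem to be proved. 

The function $\phi$ and the factor $\lambda$ are not uniquely determined; 
if $\phi$ is replaced by $\Phi(\phi)$ one has 

$$\lambda\,d\phi = \Lambda\,d\Phi \quad\text{with}\quad \lambda = \Lambda\,\frac{d\Phi}{d\phi}.$$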


8. (V. p. 43.) Thermodynamics of chemical equilibria 

Carathéodory's original publication on his foundation of 
thermodynamics (Math. Ann. 67, p. 355, 1909) is written in a 
very abstract way. He considers a type of systems which are 
called simple and defined by the property that of the parameters 
necessary to fix a state of equilibrium all except one are con- 
figurational variables, i.e. such that their values can be arbi- 
trarily prescribed (like volumes). In my own presentation of the 
theory, of 1921 (quoted in Appendix, 6), there is only a hint at the 
end (§ 9) how such variables can be introduced in more complicated 
cases, as for instance for chemical equilibria where the concen- 
trations of the constituents can be changed. I hoped at that time 
that this might be worked out by the chemists themselves, for it 
needs nothing more than the usual method of semi-permeable 
walls with a slight modification of the wording. As this has not 
happened, I shall give here a short indication how to do it. 

I consider first a simple fluid (without decomposition), but 
arrange it in such a way that volume V and mass M are both 
independently changeable. For this purpose one has to imagine 
a cylinder with a piston attached to the volume V, connected 
by a valve, through which substance can be pressed into the 
volume V considered. The position of the auxiliary piston 
determines uniquely the mass M contained in V; hence M can 
be regarded as a configuration variable in Carathéodory's sense. 
If the valve is closed, V can be changed, by moving the 'main' 
piston, without altering M. Hence M and V are both independent 
configuration variables, and the work done for any change 
of them must be regarded as measurable. If this work is determined 
adiabatically one obtains the energy function, say in 
terms of V, M and the empirical temperature $\vartheta$, $U(V, M, \vartheta)$. 

FIG. 8. 

When this is known the differential of heat is defined by the 
difference 

$$dQ = dU + p\,dV - \mu\,dM, \tag{1}$$

where p and $\mu$ are functions of the state $(V, M, \vartheta)$ like U, which 
can be regarded as empirically known. 

Now one has in (1) a Pfaffian of three variables and can apply 
the same considerations as before which lead to the result that dQ 
is integrable and can be equated to $T\,dS$. Hence one can write 

$$dU = T\,dS - p\,dV + \mu\,dM. \tag{2}$$

But U must be a homogeneous function of the first order in the 
variables S, V, M. If one introduces the specific variables 
u, s, v by 

$$U = Mu, \qquad S = Ms, \qquad V = Mv, \tag{3}$$

one has according to Euler's theorem 

$$u = Ts - pv + \mu, \tag{4}$$

where 

$$T = \frac{\partial u}{\partial s}, \qquad p = -\frac{\partial u}{\partial v}, \tag{5}$$

and then, from (2), 

$$du = T\,ds - p\,dv. \tag{6}$$

If the substance inside V is a chemical compound and one wishes 
to investigate its decomposition into n components, one assumes 
n cylinders with pistons attached to V, separated from V by 
semi-permeable walls, each of which allows the passage of only 
one of the components. Then one has in the same way 

$$dU = T\,dS - p\,dV + \sum_{i=1}^{n}\mu_i\,dM_i, \tag{7}$$

where $V, M_1, M_2, \ldots, M_n$ can be regarded as configuration variables 
and $U, p, \mu_1, \mu_2, \ldots, \mu_n$ as known functions of these. Now, as above, 
the specific energy, entropy, and volume are introduced and 
further the concentrations $c_i$ by 

$$M_i = c_iM, \tag{8}$$

where M is the total mass: $M = \sum_i M_i$; hence 

$$\sum_i c_i = 1. \tag{9}$$

One obtains from Euler's theorem 

$$u = Ts - pv + \sum_i \mu_ic_i, \tag{10}$$

with 

$$T = \frac{\partial u}{\partial s}, \qquad p = -\frac{\partial u}{\partial v}, \qquad \mu_i = \frac{\partial u}{\partial c_i}, \tag{11}$$

where the differentiation with respect to the $c_i$ is performed as 
if they were independent; and 

$$du = T\,ds - p\,dv + \sum_i \mu_i\,dc_i. \tag{12}$$

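The formalism of thermodynamics consists in deriving relations 
between the variables by differentiating the equations 
(11), e.g. 

$$\frac{\partial T}{\partial v} = -\frac{\partial p}{\partial s}, \qquad \frac{\partial\mu_i}{\partial v} = -\frac{\partial p}{\partial c_i}, \qquad \text{etc.}$$

As experiments are often performed at constant pressure or 
temperature or both, one uses instead of $u(s, v, c_1, \ldots, c_n)$ the 
functions free energy $u - Ts = f$, enthalpy $u + pv = w$, or free 
enthalpy $\mu = u + pv - Ts$ (defined by (4)). For the latter, for 
example, it follows from (12) that 

$$d\mu = -s\,dT + v\,dp + \sum_i \mu_i\,dc_i, \tag{13}$$

$$s = -\frac{\partial\mu}{\partial T}, \qquad v = \frac{\partial\mu}{\partial p}, \qquad \mu_i = \frac{\partial\mu}{\partial c_i}, \tag{14}$$

so that for T = const., p = const., one has simply 

$$d\mu = \sum_i \mu_i\,dc_i. \tag{15}$$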

The most important theorem, which follows from these general 


equations, is Gibbs's phase rule. The system may exist in differ- 
ent phases if the n equations 

^(T,p,c^c n ) = C< (t = 1,..., n) (16) 

have several solutions. Let ra be the number of independent 
solutions; then there are m phases which can be ordered in such a 
way that each has contact with two others only. Hence there 
are ml interfaces and n equations of the form (16) for each, 
altogether n(mI) independent equations. On the other hand, 
there are (nl) independent concentrations c^...,c n ^ for each 
phase, i.e. m(n I) for the whole system, to which the two vari- 
ables p, T have to be added; hence m(n l)+2 independent 
variables. The number of arbitrary parameters, or the number of 
degrees of freedom of the system, is therefore 

m(n 1)+2 -n(m 1) = n m+2. 

So for a single pure substance, n = 1, the number of degrees of
freedom is 3 − m; hence there are three cases m = 1, 2, 3 corresponding
to one phase, two or three coexisting phases; more than
three phases cannot be in equilibrium. All further progress in
thermodynamics is based on special assumptions about the functions
involved, either prompted by experiment, or chosen by an
argument of simplicity, or (and this is the most important
step) derived from statistical considerations.
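
The counting argument behind the phase rule can be sketched in a few lines of Python. This is only an illustration; the function name `degrees_of_freedom` and its argument names are assumptions introduced here, not notation from the text.

```python
def degrees_of_freedom(n_components: int, m_phases: int) -> int:
    """Count free parameters as (independent variables) - (equilibrium equations)."""
    if n_components < 1 or m_phases < 1:
        raise ValueError("need at least one component and one phase")
    # (n - 1) independent concentrations in each of the m phases, plus p and T.
    variables = m_phases * (n_components - 1) + 2
    # n equations of the form (16) at each of the (m - 1) interfaces.
    equations = n_components * (m_phases - 1)
    return variables - equations  # algebraically equal to n - m + 2


# Single pure substance (n = 1): one, two, or three coexisting phases.
for m in (1, 2, 3):
    print(f"n = 1, m = {m}: degrees of freedom = {degrees_of_freedom(1, m)}")
```

For n = 1 the count drops to zero at m = 3, which is the statement that more than three phases of a pure substance cannot coexist.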

9. (V. p. 44.) Velocity of sound in gases 

The simple problem of calculating the adiabatic law for an 
ideal gas gives me the opportunity to show how the theory of 
Carathéodory determines uniquely the absolute temperature
and entropy. 

The ideal gas is defined by two properties: (1) Boyle's law,
the isotherms are given by pV = const.; (2) the same quantity
pV remains constant if the gas expands without doing work. In
mathematical symbols,

$$pV = F(\vartheta), \quad U = U(\vartheta).$$

Hence

$$dQ = dU + p\,dV = U'\,d\vartheta + F\,\frac{dV}{V}. \tag{1}$$

If $\Theta$ is defined by

$$\log\Theta = \int \frac{U'}{F}\,d\vartheta, \tag{2}$$

(1) can be written

$$dQ = F(\vartheta)\,d\log(\Theta V);$$

hence one can put $\lambda = F(\vartheta)$, $\phi = \log(\Theta V)$ and obtain from the
equation (5.25) of the text

$$g(\vartheta) = \frac{\partial\log\lambda}{\partial\vartheta} = \frac{F'}{F}.$$

Then (5.27) gives, writing C = 1/R, the usual form of the
equation of state

$$RT = F(\vartheta) = pV \tag{3}$$

and

$$S = S_0 + R\log(\Theta V).$$

If the special assumption is made, that U depends linearly on
pV (which holds for dilute gases with the same approximation
as Boyle's law), one has $U = c_v T$ and, from (2),

$$\Theta = T^{c_v/R}.$$

The entropy becomes therefore

$$S = S_0 + \log(T^{c_v} V^{R}), \tag{4}$$

or, substituting p for T from (3),

$$S = S_1 + \log(p^{c_v} V^{c_p}), \tag{5}$$

where

$$c_p = c_v + R \tag{6}$$

is the specific heat for constant pressure. Hence the adiabatic
law S = const. is equivalent to

$$pV^{\gamma} = \text{const.}, \quad \gamma = \frac{c_p}{c_v}, \tag{7}$$

which is identical with the equation $p = a\rho^{\gamma}$ in the text, as the
density $\rho$ is reciprocal to the volume V.
The velocity of sound was calculated in Appendix, 3; according
to (3.10) it is

$$c = \sqrt{\frac{dp}{d\rho}}.$$

For the isothermal law, $p = a\rho$, this means

$$c = \sqrt{\frac{p}{\rho}},$$

while, for the adiabatic law, $p = a\rho^{\gamma}$, one finds

$$c = \sqrt{\frac{\gamma p}{\rho}},$$

which is considerably larger; e.g. for diatomic molecules (air)
experiment, and kinetic theory as well, give $\gamma = 7/5 = 1.4$.
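
As a rough numerical illustration of the difference between the isothermal and the adiabatic value, the following Python sketch evaluates both expressions for air. The pressure and density used (101325 Pa and 1.225 kg/m³, a common sea-level reference) are assumptions chosen for the example and do not come from the text; the function name is likewise hypothetical.

```python
import math


def sound_speed(pressure_pa: float, density_kg_m3: float, gamma: float = 1.0) -> float:
    """c = sqrt(gamma * p / rho); gamma = 1 gives the isothermal value."""
    return math.sqrt(gamma * pressure_pa / density_kg_m3)


# Assumed reference state for air (not taken from the text).
p_air = 101325.0   # Pa
rho_air = 1.225    # kg/m^3

c_isothermal = sound_speed(p_air, rho_air)
c_adiabatic = sound_speed(p_air, rho_air, gamma=1.4)

print(f"isothermal: {c_isothermal:.1f} m/s")
print(f"adiabatic:  {c_adiabatic:.1f} m/s")
```

The two values differ by the factor $\sqrt{\gamma}$, i.e. by roughly 18 per cent for $\gamma = 1.4$.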

10. (V. p. 45.) Thermodynamics of irreversible processes 

Since I wrote this section of the text a new development of 
the descriptive or phenomenological theory has come to my 
knowledge which is remarkable enough to be mentioned. 

It started in 1931 with a paper by Onsager in which the 
attempt was made to build up a thermodynamics of irreversible 
processes by taking from the kinetic theory one single result, 
called the theorem of microscopic reversibility, and to show that 
this suffices to obtain some important properties of the flow of 
heat, matter, and electricity. The starting-point is Einstein's 
theory of fluctuations (see Appendix, 20), where the relation 
S = k log P between probability P and entropy S is reversed,
using the known dependence of S on observable quantities to 
determine the probability P of small deviations from equili- 
brium. Then it is assumed that the law for the decay of an 
accidental accumulation of some quantity (mass, energy, tem- 
perature, etc.) is the same as that for the flow of the same 
quantity under artificially produced macroscopic conditions. 
This, together with the reversibility theorem mentioned, 
determines the main features of the flow. The theory has been 
essentially improved by Casimir and others, amongst whom 
the book of Prigogine, from de Donder's school of thermodynamics
in Brussels, must be mentioned. Here is a list of the literature:

L. Onsager, Phys. Rev. 37, p. 405 (1931); 38, p. 2265 (1931).

H. B. G. Casimir, Philips Research Reports, 1, 185-96 (April 1946); Rev. Mod. Physics, 17, p. 343 (1945).

C. Eckhart, Phys. Rev. 58, pp. 267, 269, 919, 924 (1940).

J. Meixner, Ann. d. Phys. (v), 39, p. 333 (1941); 41, p. 409 (1943); 43, p. 244 (1943); Z. phys. Chem. B, 53, p. 235 (1943).

S. R. de Groot, L'Effet Soret, thesis, Amsterdam (1945); Journal de Physique, no. 6, p. 191.

I. Prigogine, Étude thermodynamique des phénomènes irréversibles (Paris, Dunod; Liège, Desoer, 1947).



11. (VI. p. 47.) Elementary kinetic theory of gases 

To derive equation (6.1) of the text, consider the molecules
of a gas to be elastic balls which at impact on the wall of the
vessel recoil without loss of energy and momentum. If the yz-plane
coincides with the wall, the x-component of the momentum $m\xi$ of a
molecule is changed into $-m\xi$; hence the momentum $2m\xi$ is
transferred to the wall. Let $n_{\mathbf v}$ be the number of molecules per
unit of volume having the velocity vector $\mathbf v$. If one constructs a
cylinder upon a piece of the wall of area unity and side $\mathbf v\,dt$, all
molecules in it will strike the part of the wall within the cylinder
in the time-element dt; the volume of the cylinder is $\xi\,dt$, hence
the number of collisions per unit surface and unit time is $\xi n_{\mathbf v}$, and
the total momentum transferred $2m\xi^2 n_{\mathbf v}$.
This has first to be summed over all angles of incidence (i.e.
over a hemisphere); the result is obviously the same as one-half
of the sum over the total sphere, namely

$$m\,n_v\,\overline{\xi^2},$$

where $n_v$ is the number of molecules per unit of volume, with a
velocity of magnitude v (but any direction).

Fig. 9.

Now the 'principle of molecular chaos' is used according to which

$$\overline{\xi^2} = \overline{\eta^2} = \overline{\zeta^2} = \tfrac13 v^2.$$

Hence the last expression is equal to $\tfrac13 m n_v v^2$, and the pressure is
finally obtained by summing over all velocities

$$p = \frac{m}{3}\sum_v n_v v^2.$$

The total (kinetic) energy in the volume V is

$$U = V\,\frac{m}{2}\sum_v n_v v^2;$$

hence one obtains the equation (6.1) of the text,

$$Vp = \tfrac23 U.$$

Now one can apply the considerations of Appendix, 9, using the
experimental fact expressed by Boyle's law (that all states of a
gas at a fixed empirical temperature $\vartheta$ satisfy pV = const.).
Then one obtains

$$pV = RT, \quad U = \tfrac32 RT, \tag{2}$$

as stated in (6.2) of the text.
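
The relation between pressure and kinetic energy can be checked numerically. The sketch below is an illustration only: it assumes velocity components drawn independently from a Gaussian distribution (one convenient isotropic choice, not required by the argument), takes the pressure as the momentum flux $n m \overline{\xi^2}$ onto a wall normal to x, and compares it with the kinetic energy per unit volume. All names and parameter values are hypothetical.

```python
import random


def pressure_and_energy_density(n_density, mass, sigma, samples=200_000, seed=0):
    """Monte Carlo estimate of p = n m <xi^2> and U/V = n m <v^2> / 2."""
    rng = random.Random(seed)
    sum_xi2 = 0.0
    sum_v2 = 0.0
    for _ in range(samples):
        # Assumed isotropic velocity distribution: independent Gaussian components.
        xi, eta, zeta = (rng.gauss(0.0, sigma) for _ in range(3))
        sum_xi2 += xi * xi
        sum_v2 += xi * xi + eta * eta + zeta * zeta
    mean_xi2 = sum_xi2 / samples
    mean_v2 = sum_v2 / samples
    pressure = n_density * mass * mean_xi2            # momentum per unit area and time
    energy_density = 0.5 * n_density * mass * mean_v2  # kinetic energy per unit volume
    return pressure, energy_density


pressure, energy_density = pressure_and_energy_density(1.0, 1.0, 1.0)
ratio = pressure / energy_density
print(f"p / (U/V) = {ratio:.4f}  (kinetic theory predicts 2/3)")
```

Only isotropy enters the ratio; any other isotropic sampling rule should give the same value within statistical error.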

12. (VI. p. 50.) Statistical equilibrium 

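If H depends only on $\mathbf p$, not on $\mathbf x$, the equation $[H, f] = 0$
reduces to

$$p_x\frac{\partial f}{\partial x} + p_y\frac{\partial f}{\partial y} + p_z\frac{\partial f}{\partial z} = 0 \tag{1}$$

and is equivalent to the set of ordinary differential equations

$$\frac{dx}{p_x} = \frac{dy}{p_y} = \frac{dz}{p_z}. \tag{2}$$

By integrating these ($\mathbf p$ is constant) one obtains the general
solution of (1) as an arbitrary function of the integrals of (2),

$$f = \Phi(\mathbf p, \mathbf m), \tag{3}$$

where

$$\mathbf m = \mathbf x \wedge \mathbf p. \tag{4}$$

Now if the gas is isotropic, f can depend only on $\mathbf p^2$ and $\mathbf m^2$,
and if it is to be homogeneous (i.e. all properties are independent
of $\mathbf x$), $\mathbf m^2$ cannot appear; hence

$$f = \Phi(\mathbf p^2) = \Psi(H), \tag{5}$$

as stated in the text.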

13. (VI. p. 51.) Maxwell's functional equation 

To solve the equation (6.10) it suffices to take $\zeta = 0$; the
equation then takes the form

$$f(x+y) = \phi(x)\,\phi(y). \tag{1}$$

Differentiating partially with respect to x,

$$f'(x+y) = \phi'(x)\,\phi(y), \tag{2}$$

and dividing by the original equation

$$\frac{f'(x+y)}{f(x+y)} = \frac{\phi'(x)}{\phi(x)}. \tag{3}$$

Now, the right-hand side is independent of y, and the left-hand
side cannot therefore depend on y; hence, putting x = 0,

$$\frac{f'(y)}{f(y)} = -\beta, \tag{4}$$

where $\beta$ is a constant. By integration,

$$f(y) = a e^{-\beta y} = e^{\alpha - \beta y}, \tag{5}$$

which is the formula (6.11) of the text.

14. (VI. p. 52.) The method of the most probable distribution

We have to determine the probability of a distribution of
equal particles over N cells, where $n_1$ of them are in the first cell,
$n_2$ of them in the second, etc. ($n_1 + n_2 + \dots + n_N = n$). To do this
we first take the particles in a fixed order; then the probability of
distribution $(n_1, n_2, \dots, n_N)$ is, according to the multiplication law,

$$\omega_1\omega_1\cdots\omega_1\;\omega_2\cdots\omega_2\;\cdots\;\omega_N\cdots\omega_N = \omega_1^{n_1}\omega_2^{n_2}\cdots\omega_N^{n_N},$$

where $\omega_1, \omega_2, \dots, \omega_N$ are the relative volumes of the cells,
normalized so that $\omega_1 + \omega_2 + \dots + \omega_N = 1$. To obtain the probability
asked for, we have to destroy the fixed order of the particles. If
one performs first all n! permutations, one gets too many cases,
as all those distributions, which differ only by permuting the
particles in each cell, count only once. Therefore one has to
divide n! by the number of all these permutations inside a cell,
that is by $n_1!\,n_2!\cdots n_N!$. The total result is the expression (6.15),

$$P(n_1, n_2, \dots, n_N) = \frac{n!}{n_1!\,n_2!\cdots n_N!}\,\omega_1^{n_1}\omega_2^{n_2}\cdots\omega_N^{n_N}, \tag{1}$$

which is nothing but the general term in the polynomial expansion

$$\sum_{n_1,\dots,n_N} P(n_1, n_2, \dots, n_N) = \sum_{n_1,\dots,n_N}\frac{n!}{n_1!\cdots n_N!}\,\omega_1^{n_1}\cdots\omega_N^{n_N} = (\omega_1 + \dots + \omega_N)^n = 1.$$

We now deal with the approximation of n! by Stirling's
formula. The simplest way to obtain it is this: write

$$\log n! = \log(1\cdot2\cdot3\cdots n) = \log 1 + \log 2 + \log 3 + \dots + \log n = \sum_{k=1}^{n}\log k$$

and replace the sum $\sum \log k$ by the integral

$$\int_0^n \log x\,dx = n(\log n - 1).$$

A more satisfactory derivation is the following: One can represent 
n! by an integral and evaluate it with the help of the so-called 
method of steepest descent, which plays a great part in the 
modern treatment of statistical mechanics due to Darwin and 
Fowler (see p. 54). The approximate evaluation of n! may serve 
as a simple example of this method. 
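If the identity

$$\frac{d}{dx}\left(e^{-x}x^n\right) = -e^{-x}x^n + n\,e^{-x}x^{n-1}$$

is integrated from 0 to $\infty$ and the abbreviation ($\Gamma$-function)

$$\Gamma(n+1) = \int_0^\infty e^{-x}x^n\,dx \quad (n > -1), \tag{2}$$

used, one obtains

$$\Gamma(n+1) = n\,\Gamma(n). \tag{3}$$

As $\Gamma(1) = 1$, one has

$$\Gamma(2) = 1\cdot\Gamma(1) = 1, \quad \Gamma(3) = 2\,\Gamma(2) = 1\cdot2, \quad \Gamma(4) = 3\,\Gamma(3) = 1\cdot2\cdot3,$$

and in general

$$\Gamma(n+1) = n!. \tag{4}$$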

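The integral (2) can be written

$$\Gamma(n+1) = \int_0^\infty e^{f(n,x)}\,dx, \quad f(n,x) = -x + n\log x. \tag{5}$$

The function f(n, x) (hence also the integrand) has a maximum where

$$f'(x) = -1 + \frac{n}{x} = 0,$$

i.e. at x = n, and

$$f(n) = -n + n\log n, \quad f''(n) = -\frac{1}{n}.$$

The expansion of f(x) in the neighbourhood of the maximum
x = n is therefore

$$f(x) = -n + n\log n - \frac{(x-n)^2}{2n} + \dots$$

and one has

$$\Gamma(n+1) = e^{-n}n^n\int_0^\infty e^{-(x-n)^2/2n + \dots}\,dx,$$

where the dots indicate terms of higher order which can be easily
worked out. If these are neglected the integral becomes

$$\int_0^\infty e^{-(x-n)^2/2n}\,dx = \int_{-n}^\infty e^{-t^2/2n}\,dt \to \int_{-\infty}^\infty e^{-t^2/2n}\,dt = \sqrt{2\pi n}$$

for large n. Hence

$$n! = \Gamma(n+1) = \sqrt{2\pi n}\,e^{-n}n^n + \dots \tag{6}$$

and

$$\log n! = n\log n - n + \tfrac12\log(2\pi n) + \dots, \tag{7}$$

where the highest terms agree with the previous result.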
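
How well the two forms of Stirling's formula perform can be seen from a short numerical comparison. The sketch below uses Python's `math.lgamma` for the exact value of log n!; the helper names are assumptions made for this illustration.

```python
import math


def log_factorial_exact(n: int) -> float:
    """log n! computed as log Gamma(n + 1)."""
    return math.lgamma(n + 1)


def stirling_leading(n: int) -> float:
    """Leading terms only: n log n - n."""
    return n * math.log(n) - n


def stirling_full(n: int) -> float:
    """Steepest-descent result: n log n - n + (1/2) log(2 pi n)."""
    return n * math.log(n) - n + 0.5 * math.log(2.0 * math.pi * n)


for n in (10, 100, 1000):
    exact = log_factorial_exact(n)
    print(f"n = {n:5d}  exact = {exact:12.4f}  "
          f"leading error = {exact - stirling_leading(n):8.4f}  "
          f"full error = {exact - stirling_full(n):10.6f}")
```

The error of the leading terms grows only logarithmically with n, while the corrected form is accurate to a small fraction of unity already for moderate n.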
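Thus the logarithm of the probability P can be written

$$\log P = \sum_s \phi_s(n_s) + \text{const.}, \tag{8}$$

where

$$\phi_s(n_s) = n_s(\log\omega_s - \log n_s). \tag{9}$$

(8) and (9) are, for equal $\omega$'s, equivalent to formula (6.17) of
the text.

One has to determine the maximum of log P with the conditions
(6.13), (6.14), of the text, namely

$$\sum_{s=1}^{N} n_s = n, \quad \sum_{s=1}^{N} n_s\epsilon_s = U. \tag{10}$$

Without using the special form of $\phi_s$, one obtains

$$\frac{\partial\phi_s}{\partial n_s} = \lambda + \beta\epsilon_s, \tag{11}$$

where $\lambda$, $\beta$ are two Lagrangian factors. For the special function
(9) one has

$$\frac{\partial\phi_s}{\partial n_s} = \log\omega_s - \log n_s - 1, \tag{12}$$

and if this is substituted in (11) with $\lambda + 1 = -\alpha$,

$$\log n_s = \log\omega_s + \alpha - \beta\epsilon_s,$$

$$n_s = \omega_s e^{\alpha - \beta\epsilon_s}; \tag{13}$$

that is, for equal $\omega$'s, the formula (6.18) of the text.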
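
The exponential form of the most probable distribution can be illustrated by direct enumeration in a very small case. The following Python sketch is an assumption-laden toy model, not part of the text: three cells of equal weight with energies 0, 1, 2 (in arbitrary units), n = 300 particles and total energy U = 200. For equally spaced energies an exponential law implies $n_2^2 \approx n_1 n_3$, which the enumeration reproduces up to the granularity of whole numbers.

```python
import math


def log_probability(occupations, weights):
    """log of n!/(n_1! ... n_N!) * w_1^n_1 ... w_N^n_N, via log-Gamma."""
    n = sum(occupations)
    result = math.lgamma(n + 1)
    for n_s, w_s in zip(occupations, weights):
        result += n_s * math.log(w_s) - math.lgamma(n_s + 1)
    return result


def most_probable(n, total_energy):
    """Scan all occupations (n1, n2, n3) of cells with energies 0, 1, 2."""
    weights = (1.0 / 3.0, 1.0 / 3.0, 1.0 / 3.0)  # assumed equal cell volumes
    best = None
    for n3 in range(total_energy // 2 + 1):
        n2 = total_energy - 2 * n3   # fixes the total energy
        n1 = n - n2 - n3             # fixes the total number
        if n1 < 0 or n2 < 0:
            continue
        lp = log_probability((n1, n2, n3), weights)
        if best is None or lp > best[0]:
            best = (lp, (n1, n2, n3))
    return best[1]


n1, n2, n3 = most_probable(300, 200)
geometric_ratio = n2 * n2 / (n1 * n3)
print("most probable occupations:", (n1, n2, n3))
print(f"n2^2 / (n1 n3) = {geometric_ratio:.3f}  (exactly 1 for a strict exponential law)")
```

With larger n the ratio approaches unity more closely, since the restriction to integer occupation numbers becomes less important.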


If one has two sets of systems A and B, as discussed in the text,
there are three conditions

$$\sum_r n_r^{(A)} = n^{(A)}, \quad \sum_r n_r^{(B)} = n^{(B)}, \quad \sum_r n_r^{(A)}\epsilon_r^{(A)} + \sum_r n_r^{(B)}\epsilon_r^{(B)} = U,$$

and therefore instead of two multipliers three, $\lambda^{(A)}$, $\lambda^{(B)}$, $\beta$; and
one obtains, with $\lambda^{(A)} + 1 = -\alpha^{(A)}$, $\lambda^{(B)} + 1 = -\alpha^{(B)}$, the formulae
(6.19) of the text, which show that $\beta$ is the equilibrium parameter,
a function of the (empirical) temperature $\vartheta$ alone.

In order to see that $\beta$ is reciprocal to the absolute temperature
one must apply the second theorem of thermodynamics,
which refers to quasi-static processes involving external work
(for instance by changing the volume).

By an infinitely slow change of external parameters $a_1, a_2, \dots$,
the energies of the cells $\epsilon_r$ will be altered and at the same time
the occupation numbers $n_r$; the total energy will be changed by

$$dU = \sum_r n_r\,d\epsilon_r + \sum_r \epsilon_r\,dn_r, \tag{14}$$

while the total number of particles is unchanged,

$$dn = \sum_r dn_r = 0. \tag{15}$$

The first term in (14) represents the total work done

$$dW = -n\sum_a f_a\,da_a, \tag{16}$$

where

$$f_a = -\frac{1}{n}\sum_r n_r\frac{\partial\epsilon_r}{\partial a_a} \tag{17}$$

is the average force resisting a change of $a_a$. Then the second
term in (14)

$$dQ = \sum_r \epsilon_r\,dn_r \tag{18}$$

must represent the heat produced by the rearrangements of the
systems over the cells.

The corresponding change of log P is obtained from (8)
and (11),

$$d\log P = \sum_r \frac{\partial\phi_r}{\partial n_r}\,dn_r = \sum_r(\lambda + \beta\epsilon_r)\,dn_r,$$

which in virtue of (15) and (18) reduces to

$$d\log P = \beta\,dQ. \tag{19}$$

This shows that $\beta\,dQ$ is a total differential of a function depending
on $\beta, a_1, a_2, \dots$, and that $\beta(\vartheta)$ is the integrating factor.

Hence the second law of thermodynamics is automatically
satisfied by the statistical assembly, and one has, with the
notations of section V,

$$dQ = \lambda\,d\phi, \quad \text{with } \lambda = \frac{1}{\beta}, \quad \phi = \log P;$$

then (5.25) and (5.26) give, with C = 1/k, $\log\Phi = 0$, $\Phi = 1$,
and (5.27),

$$T = \frac{1}{k\beta}, \quad S - S_0 = k\phi = k\log P. \tag{20}$$

k is called Boltzmann's constant.

Now the change of energy (14) becomes

$$dU = -n\sum_a f_a\,da_a + T\,dS. \tag{21}$$

If one has a fluid with the only parameter $a_1 = V$, the corresponding
force is the pressure

$$p = n f_1 = -\sum_r n_r\frac{\partial\epsilon_r}{\partial V}, \tag{22}$$

and one obtains the usual equation

$$dU = -p\,dV + T\,dS. \tag{23}$$

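Returning to the general expression (21) one sees easily that
one can express all quantities in terms of the so-called partition
function (or 'sum-over-states')

$$Z = \sum_r \omega_r e^{-\beta\epsilon_r}. \tag{24}$$

For, from (10) and (13),

$$n = e^{\alpha}\sum_r \omega_r e^{-\beta\epsilon_r} = e^{\alpha}Z,$$

hence

$$\alpha = \log n - \log Z.$$

Now one has, after simple calculation, from (19), (20) with
(8), (9), (17)

$$u = \frac{U}{n} = -\frac{\partial\log Z}{\partial\beta}, \quad s = \frac{S - S_0}{n} = k(\log Z + \beta u), \tag{25}$$

and

$$du = -\sum_a f_a\,da_a + T\,ds. \tag{26}$$

The simplest thermodynamical function, from which all
others can be derived by differentiation, is the free energy,

$$f = u - Ts = -kT\log Z, \tag{27}$$

for which

$$f_a = -\frac{\partial f}{\partial a_a} = kT\frac{\partial\log Z}{\partial a_a},$$

while $s = -\partial f/\partial T$ leads back to the second formula (25).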

The application to ideal gases may be illustrated by the simplest
model where each particle is regarded as a mass point with
coordinates x, y, z, momenta $p_x, p_y, p_z$, and mass m. Then,
according to Liouville's theorem, one has to take as cells $\omega_s$
elements of the phase space $dx\,dy\,dz\,dp_x\,dp_y\,dp_z$ and replace the
sums by integrals. The energy is $(p_x^2 + p_y^2 + p_z^2)/2m$. Then the
partition function (24) becomes

$$Z = \int\!\cdots\!\int e^{-(\beta/2m)(p_x^2 + p_y^2 + p_z^2)}\,dx\,dy\,dz\,dp_x\,dp_y\,dp_z.$$

The integration over the space coordinates gives V, the volume.
If one puts $\sqrt{\beta/2m}\,p_x = \xi, \dots$, one has

$$Z = V\left(\frac{2m}{\beta}\right)^{3/2}\iiint e^{-(\xi^2 + \eta^2 + \zeta^2)}\,d\xi\,d\eta\,d\zeta,$$

the integration extended for each variable from $-\infty$ to $+\infty$.
The integral is a constant which is of no interest as all physical
quantities depend on derivatives of Z. Hence, with $\beta = (kT)^{-1}$,

$$f = -kT\log Z = -kT\left(\log V + \tfrac32\log T\right) + \text{const.}\,T,$$

from which one obtains

$$p = -n\frac{\partial f}{\partial V} = \frac{nkT}{V}, \quad s = -\frac{\partial f}{\partial T} = k\left(\log V + \tfrac32\log T\right) + \text{const.}, \quad u = f + Ts = \tfrac32 kT.$$

These are the well-known formulae for an ideal monatomic gas:
Boyle's law, the entropy and energy per atom. The specific
heat at constant volume is

$$c_v = n\frac{du}{dT} = \tfrac32 nk = \tfrac32 R,$$

if n refers to one mole.
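
The statement that all quantities follow from the partition function by differentiation can be checked numerically for the ideal monatomic gas. The sketch below works in units with k = 1 and drops the constant Gaussian integral, which only adds a constant to log Z; the function names and the finite-difference step are assumptions of this illustration.

```python
import math


def log_partition(beta: float, volume: float, mass: float = 1.0) -> float:
    """log Z for a mass point, up to an additive constant: log V + (3/2) log(2m/beta)."""
    return math.log(volume) + 1.5 * math.log(2.0 * mass / beta)


def mean_energy(beta: float, volume: float, h: float = 1e-6) -> float:
    """u = -d(log Z)/d(beta), by central difference."""
    return -(log_partition(beta + h, volume) - log_partition(beta - h, volume)) / (2.0 * h)


def pressure_per_particle(beta: float, volume: float, h: float = 1e-6) -> float:
    """p/n = (1/beta) d(log Z)/dV, by central difference."""
    return (log_partition(beta, volume + h) - log_partition(beta, volume - h)) / (2.0 * h) / beta


beta, volume = 2.0, 5.0           # arbitrary test values, kT = 1/beta = 0.5
print("u   =", mean_energy(beta, volume), " expected 3/(2 beta) =", 1.5 / beta)
print("p/n =", pressure_per_particle(beta, volume), " expected 1/(beta V) =", 1.0 / (beta * volume))
```

The first line reproduces the energy per atom (3/2)kT, the second Boyle's law in the form pV = nkT.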


15. (VI. p. 54.) The method of mean values 

The method of Darwin and Fowler aims at computing the
mean value of any quantity $f(n_r)$, depending on the occupation
number $n_r$ of a cell for all possible distributions $n_1, n_2, \dots, n_N$,
satisfying the conditions

$$\sum_r n_r = n, \quad \sum_r n_r\epsilon_r = U, \tag{1}$$

that is, the quantity

$$\overline{f(n_r)} = \sum P(n_1, n_2, \dots, n_N)\,f(n_r), \tag{2}$$

where P is the probability of the distribution $n_1, n_2, \dots, n_N$, defined
by equation (6.15) or 14, (1).

We consider the function F(z) defined by (6.22),

$$F(z) = \omega_1 z^{\epsilon_1} + \omega_2 z^{\epsilon_2} + \dots + \omega_N z^{\epsilon_N}, \tag{3}$$

and assume that a very small unit of energy is chosen so that all
the $\epsilon_r$ are positive integers, which may be ordered in such a way
that $\epsilon_1 \le \epsilon_2 \le \epsilon_3 \le \dots \le \epsilon_N$; also, by choosing the zero of energy
suitably we can arrange that $\epsilon_1 = 0$.

Then we expand $\{F(z)\}^n$ into powers of z according to the
multinomial theorem and obtain a series of terms

$$\frac{n!}{n_1!\,n_2!\cdots n_N!}\,\omega_1^{n_1}\omega_2^{n_2}\cdots\omega_N^{n_N}\,z^{n_1\epsilon_1 + n_2\epsilon_2 + \dots + n_N\epsilon_N};$$

by collecting all these terms with the same factor $z^U$ we obtain all
the $P(n_1, n_2, \dots, n_N)$ which belong to the same value of

$$U = \sum_r \epsilon_r n_r.$$

Now we substitute 1 for each $f(n_r)$ in (2) and obtain in this
way the total probability of these distributions which have a
given total energy U, in the form:

$$\sum P = \text{coefficient of } z^U \text{ in } \{F(z)\}^n.$$

This coefficient can be evaluated by Cauchy's theorem, if z is
regarded as a complex variable; one has

$$\sum P = \frac{1}{2\pi i}\oint \{F(z)\}^n z^{-U-1}\,dz, \tag{4}$$

where the integral is taken round a closed contour surrounding
the origin in the z-plane. The integral can be evaluated
approximately by the method of steepest descent which we have
already explained, for real variables, in Appendix 14, for the
$\Gamma$-function.

The first step is to express the integrand in the form

$$z^{-U-1}\{F(z)\}^n = e^{G(z)}, \quad G(z) = n\log F(z) - (U+1)\log z.$$

Both log F(z) and its derivative increase monotonically from a
finite value to $\infty$ as z moves along the real axis between 0 and
$\infty$. Also $-\log z$ and its negative derivative $z^{-1}$ decrease
monotonically along the same path. Hence G(z) can have only one
extremum, a minimum, on the real axis between 0 and $\infty$, and
this minimum will be extremely steep if n and U are large.

Also let $z_0$ be the point of the real axis where the minimum
happens to be; then at this point the first derivative of G(z)
vanishes and the second is positive and very large. Hence in the
direction orthogonal to the axis the integrand must have a very
sharp maximum. If we take as contour of integration a circle
about the origin through $z_0$, only the immediate neighbourhood of this
point will contribute appreciably to the integral.

The minimum $z_0$ is to be found as root of the equation

$$G'(z_0) = n\frac{F'(z_0)}{F(z_0)} - \frac{U+1}{z_0} = 0, \tag{5}$$

and one has

$$G''(z_0) = n\left[\frac{F''(z_0)}{F(z_0)} - \left(\frac{F'(z_0)}{F(z_0)}\right)^2\right] + \frac{U+1}{z_0^2}.$$

This shows that for large U and n a proportional increase in U
and n will not change the root $z_0$, while $G''(z_0)$, which is positive,
can be made arbitrarily large.



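Putting $z = z_0 + iy$ one obtains for the integral (4)

$$\sum P = \frac{1}{2\pi}e^{G(z_0)}\int_{-\infty}^{\infty} e^{-\frac12 G''(z_0)\,y^2}\,dy,$$

where the terms of higher than second order are omitted and the
limits of integration are taken to be $\pm\infty$ because of the sharp
drop of the exponential function. This gives

$$\sum P = \frac{e^{G(z_0)}}{\sqrt{2\pi G''(z_0)}} = \frac{\{F(z_0)\}^n}{z_0^{U+1}\sqrt{2\pi G''(z_0)}}. \tag{6}$$

U + 1 can be replaced by U, because of the smallness of the energy
unit chosen; if one puts

$$z_0 = e^{-\beta} \tag{7}$$

one has, for $N \to \infty$,

$$F(z_0) = \sum_r \omega_r z_0^{\epsilon_r} = \sum_r \omega_r e^{-\beta\epsilon_r} = Z(\beta), \tag{8}$$

which shows that the function F(z) is equivalent to the partition
function introduced in (14.24), p. 158.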

If one now takes the logarithm of (6) the leading terms are

$$\log\sum P = n\log F(z_0) - U\log z_0 = n\log Z + \beta U.$$

On the other hand, one has from (5) to the same approximation

$$U = n z_0\frac{F'(z_0)}{F(z_0)} = -\frac{n}{Z}\frac{dZ}{d\beta} = -n\frac{d\log Z}{d\beta}, \tag{9}$$

in agreement with (14.25); hence

$$\log\sum P = n(\log Z + \beta u). \tag{10}$$

Comparison with (14.25) shows that the entropy in this theory
is to be defined by

$$S = k\log\sum P, \tag{11}$$

while in the Appendix, 14, the definition was S = k log P, where
P means the maximum value of the probability.

Thus it becomes clear that owing to the enormous sharpness of 
the maximum it does not matter whether one averages over all 
states or picks out only the state of maximum probability. 
In fact, the two methods, that of the most probable distribution 
and that of mean values, do not differ as much as it appears. 
Both use asymptotic approximations for the combinatorial 



quantities: either for each factorial in the probability before 
averaging, or for the resultant integral after averaging. The 
results are completely identical. Yet there are apostles and dis- 
ciples for each of the two doctrines who regard their creed as the 
only orthodox one. In my opinion it is just a question of training 
and practice which formalism is more convenient. The method 
of Darwin and Fowler has perhaps the advantage of greater 
flexibility. The partition function is nothing but a 'generating 
function' for the probabilities, and allows the representation of 
these by complex integrals. In this way the powerful methods 
of the theory of analytic functions of complex integrals can be 
utilized for thermodynamics. 
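
The quality of the steepest-descent evaluation can be tested on a small model in which the coefficient of $z^U$ is computable exactly. The Python sketch below is an illustration under stated assumptions: three cells of equal weight with energies 0, 1, 2, so that F(z) = (1 + z + z²)/3, with n = 300 and U = 200. The exact coefficient is obtained by multiplying out the polynomial with integer arithmetic; the approximation uses the Gaussian integral around the real saddle point. All function names are hypothetical.

```python
import math


def exact_log_sum_p(n: int, total_energy: int) -> float:
    """log of the coefficient of z^U in ((1 + z + z^2)/3)^n, by integer convolution."""
    coeffs = [1]
    for _ in range(n):
        new = [0] * (len(coeffs) + 2)
        for k, c in enumerate(coeffs):
            new[k] += c
            new[k + 1] += c
            new[k + 2] += c
        coeffs = new
    return math.log(coeffs[total_energy]) - n * math.log(3)


def log_f(z):    # log F(z)
    return math.log((1 + z + z * z) / 3)


def dlog_f(z):   # F'(z) / F(z)
    return (1 + 2 * z) / (1 + z + z * z)


def d2log_f(z):  # F''/F - (F'/F)^2
    f = 1 + z + z * z
    return 2 / f - ((1 + 2 * z) / f) ** 2


def saddle_log_sum_p(n: int, total_energy: int) -> float:
    """log of exp(G(z0)) / sqrt(2 pi G''(z0)), G(z) = n log F(z) - (U + 1) log z."""
    lo, hi = 1e-9, 1e6
    for _ in range(200):  # bisection on z G'(z), which increases with z
        mid = 0.5 * (lo + hi)
        if n * mid * dlog_f(mid) - (total_energy + 1) < 0:
            lo = mid
        else:
            hi = mid
    z0 = 0.5 * (lo + hi)
    g = n * log_f(z0) - (total_energy + 1) * math.log(z0)
    g2 = n * d2log_f(z0) + (total_energy + 1) / z0 ** 2
    return g - 0.5 * math.log(2 * math.pi * g2)


exact_value = exact_log_sum_p(300, 200)
saddle_value = saddle_log_sum_p(300, 200)
print(f"exact  log sum P = {exact_value:.4f}")
print(f"saddle log sum P = {saddle_value:.4f}")
```

The two logarithms should agree to within a few hundredths for these values, and the agreement improves as n and U are increased in proportion.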

16. (VI. p. 56.) Boltzmann's collision integral 

The collision integral (6.24) can be derived in the following way.

The gas is supposed to be so diluted that only binary encounters
are to be taken into account. Then the relative motion of two
colliding particles has an initial and a final straight line asymptote.

To specify an encounter we define the 'cross-section' as the
plane through a point O with a normal parallel to the relative
velocity $\boldsymbol\xi_1 - \boldsymbol\xi_2$ of two particles before an encounter and
introduce the position vector $\mathbf b$ in this plane. We erect a cylindrical
volume element over the area $d\mathbf b$ with the height $|\boldsymbol\xi_1 - \boldsymbol\xi_2|\,dt$; then
all particles in this element having the relative velocity
$\boldsymbol\xi_1 - \boldsymbol\xi_2$ will pass through $d\mathbf b$ in time dt. The probability of a
particle 2 passing a particle 1 at O within the cross-section element
$d\mathbf b$ is obtained from the product $f(1)\,d\mathbf x_1\,d\boldsymbol\xi_1\,f(2)\,d\mathbf x_2\,d\boldsymbol\xi_2$ by
replacing $d\mathbf x_2$ by the volume of the cylinder $|\boldsymbol\xi_1 - \boldsymbol\xi_2|\,d\mathbf b\,dt$,

$$f(1)\,f(2)\,|\boldsymbol\xi_1 - \boldsymbol\xi_2|\,d\mathbf x_1\,d\boldsymbol\xi_1\,d\boldsymbol\xi_2\,d\mathbf b\,dt.$$

Fig. 10.

Every encounter changes the velocities and removes therefore
the particle 1 from the initial range. The total loss is obtained
by integrating over all $d\mathbf b$ and $d\boldsymbol\xi_2$:

$$d\mathbf x_1\,d\boldsymbol\xi_1\,dt\iint f(1)\,f(2)\,|\boldsymbol\xi_1 - \boldsymbol\xi_2|\,d\mathbf b\,d\boldsymbol\xi_2. \tag{1}$$



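But there are other encounters such that the final state of the
particle 1 is in the element $d\mathbf x_1\,d\boldsymbol\xi_1$; they are called inverse
encounters.

If the final velocities of the direct encounters are $\boldsymbol\xi_1', \boldsymbol\xi_2'$, the
laws of collision (conservation of momentum and energy) allow
one to express $\boldsymbol\xi_1', \boldsymbol\xi_2'$ in terms of $\boldsymbol\xi_1, \boldsymbol\xi_2$ and two further parameters
(the components of $\mathbf b$ in the cross-section plane). These relations
are linear in $\boldsymbol\xi_1, \boldsymbol\xi_2$ and may be shortly written

$$(\boldsymbol\xi_1', \boldsymbol\xi_2') = \mathcal K\,(\boldsymbol\xi_1, \boldsymbol\xi_2), \tag{2}$$

where $\mathcal K$ represents a 6 × 6 matrix. It is obvious that the
solutions of these equations for $\boldsymbol\xi_1$ and $\boldsymbol\xi_2$ in terms of $\boldsymbol\xi_1'$ and $\boldsymbol\xi_2'$ must
have the same form; that means that $\mathcal K^{-1} = \mathcal K$, so that $\mathcal K^2 = 1$
and $|\mathcal K| = \pm 1$, or

$$d\boldsymbol\xi_1'\,d\boldsymbol\xi_2' = d\boldsymbol\xi_1\,d\boldsymbol\xi_2. \tag{3}$$

Further, the elementary theory of collisions (conservation of
energy and momentum) implies

$$|\boldsymbol\xi_1' - \boldsymbol\xi_2'| = |\boldsymbol\xi_1 - \boldsymbol\xi_2|. \tag{4}$$

Hence the number of inverse encounters is

$$d\mathbf x_1\,d\boldsymbol\xi_1\,dt\iint f'(1)\,f'(2)\,|\boldsymbol\xi_1 - \boldsymbol\xi_2|\,d\mathbf b\,d\boldsymbol\xi_2, \tag{5}$$

where f'(1) means $f(t, \mathbf x_1, \boldsymbol\xi_1')$, $\boldsymbol\xi_1'$ being the linear function of
$\boldsymbol\xi_1, \boldsymbol\xi_2$ given by (2).

Combining (1) and (5), one obtains for the total gain of particles
(1) in $d\mathbf x_1\,d\boldsymbol\xi_1$, per time-element dt,

$$d\mathbf x_1\,d\boldsymbol\xi_1\,dt\iint \{f'(1)f'(2) - f(1)f(2)\}\,|\boldsymbol\xi_1 - \boldsymbol\xi_2|\,d\mathbf b\,d\boldsymbol\xi_2. \tag{6}$$

This has to be equated to the change of f(1) calculated without
assuming interactions, namely,

$$\left\{\frac{\partial f(1)}{\partial t} - [H, f(1)]\right\}d\mathbf x_1\,d\boldsymbol\xi_1\,dt. \tag{7}$$

The results are the combined formulae (6.23) and (6.24) of the
text:

$$\frac{\partial f(1)}{\partial t} - [H, f(1)] = \iint \{f'(1)f'(2) - f(1)f(2)\}\,|\boldsymbol\xi_1 - \boldsymbol\xi_2|\,d\mathbf b\,d\boldsymbol\xi_2. \tag{8}$$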



17. (VI. p. 57.) Irreversibility in gases 

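Assuming no external forces, the Hamiltonian of a particle
is $H = \mathbf p^2/2m$; hence Boltzmann's equation (6.23), or (Appendix,
16.8) reduces to

$$\frac{\partial f(1)}{\partial t} = -\boldsymbol\xi_1\cdot\frac{\partial f(1)}{\partial\mathbf x_1} + \iint\{f'(1)f'(2) - f(1)f(2)\}\,|\boldsymbol\xi_1 - \boldsymbol\xi_2|\,d\mathbf b\,d\boldsymbol\xi_2. \tag{1}$$

If now the entropy is defined by (6.25) or, using the velocity
instead of the momentum, by

$$S = -k\iint f(1)\log f(1)\,d\mathbf x_1\,d\boldsymbol\xi_1, \tag{2}$$

one obtains

$$\frac{dS}{dt} = -k\iint\frac{\partial f(1)}{\partial t}\{1 + \log f(1)\}\,d\mathbf x_1\,d\boldsymbol\xi_1,$$

and substituting $\partial f(1)/\partial t$ from (1)

$$\frac{dS}{dt} = k\iint\boldsymbol\xi_1\cdot\frac{\partial f(1)}{\partial\mathbf x_1}\{1 + \log f(1)\}\,d\mathbf x_1\,d\boldsymbol\xi_1 - k\iiiint\{1 + \log f(1)\}\{f'(1)f'(2) - f(1)f(2)\}\,|\boldsymbol\xi_1 - \boldsymbol\xi_2|\,d\mathbf b\,d\mathbf x_1\,d\boldsymbol\xi_1\,d\boldsymbol\xi_2. \tag{3}$$

Here the first integral can be written

$$k\iint\boldsymbol\xi_1\cdot\frac{\partial}{\partial\mathbf x_1}\{f(1)\log f(1)\}\,d\mathbf x_1\,d\boldsymbol\xi_1$$

and transformed, by Gauss's theorem, into a surface integral
over the walls of the container,

$$k\int d\sigma\int\boldsymbol\xi_1\cdot\boldsymbol\nu\,f(1)\log f(1)\,d\boldsymbol\xi_1,$$

where $\boldsymbol\nu$ is the unit vector parallel to the outer normal of the
surface, $d\sigma$ the surface element.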

The inner integral is n times the mean value over all velocities
of $\boldsymbol\xi_1\cdot\boldsymbol\nu\,\log f(1)$, where $n = \int f(1)\,d\boldsymbol\xi_1$ is the number density. But
this average vanishes at the surface which is supposed to be
perfectly elastic and at rest, external interference being excluded;
for the numbers of incident and reflected particles with the same
absolute value of the normal component of the velocity, $|\boldsymbol\xi\cdot\boldsymbol\nu|$,
will be equal.

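Hence there remains only the second integral in (3). This can
be written in four different forms, namely, apart from the one
given in (3), where the factor 1 + log f(1) appears, three others where
this factor is replaced by 1 + log f(2) or 1 + log f'(1) or 1 + log f'(2)
(with the sign reversed in the two dashed cases, since the difference
f'(1)f'(2) − f(1)f(2) then changes sign). For it is
obvious that 1 and 2 can be interchanged as the integration is
extended over both points in a symmetric way; and the dashed
variables can be exchanged with the undashed ones as

$$d\boldsymbol\xi_1'\,d\boldsymbol\xi_2' = d\boldsymbol\xi_1\,d\boldsymbol\xi_2, \quad |\boldsymbol\xi_1' - \boldsymbol\xi_2'| = |\boldsymbol\xi_1 - \boldsymbol\xi_2|$$

(see Appendix, 16.3, 4). Hence

$$\frac{dS}{dt} = -\frac{k}{4}\iiiint\{\log f(1) + \log f(2) - \log f'(1) - \log f'(2)\}\{f'(1)f'(2) - f(1)f(2)\}\,|\boldsymbol\xi_1 - \boldsymbol\xi_2|\,d\mathbf b\,d\mathbf x_1\,d\boldsymbol\xi_1\,d\boldsymbol\xi_2$$

$$= \frac{k}{4}\iiiint\log\frac{f'(1)f'(2)}{f(1)f(2)}\,\{f'(1)f'(2) - f(1)f(2)\}\,|\boldsymbol\xi_1 - \boldsymbol\xi_2|\,d\boldsymbol\xi_1\,d\boldsymbol\xi_2\,d\mathbf b\,d\mathbf x_1. \tag{4}$$

Now $\log\dfrac{f'(1)f'(2)}{f(1)f(2)}$ is positive or negative according as f'(1)f'(2)
is greater or smaller than f(1)f(2); it has therefore always the
same sign as f'(1)f'(2) − f(1)f(2), and one obtains

$$\frac{dS}{dt} \ge 0; \tag{5}$$

the = sign can hold only if

$$f'(1)f'(2) = f(1)f(2) \tag{6}$$

or

$$\log f'(1) + \log f'(2) = \log f(1) + \log f(2). \tag{7}$$

One can express this also by saying that

$$\log f(1) + \log f(2) \tag{8}$$

is a collision invariant.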

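The mechanics of the two-body problem teaches that there are
only four quantities conserved at a collision: the three components
of the momentum $m\boldsymbol\xi_1 + m\boldsymbol\xi_2$ and the total energy
$\tfrac12 m\boldsymbol\xi_1^2 + \tfrac12 m\boldsymbol\xi_2^2$. Hence log f must be a linear combination of these:

$$\log f = \alpha - \tfrac12\beta m\boldsymbol\xi^2 + \boldsymbol\gamma\cdot\boldsymbol\xi. \tag{9}$$

This can also be written

$$f = e^{\alpha_1 - \frac12\beta m(\boldsymbol\xi - \mathbf u)^2}, \tag{10}$$

where

$$\mathbf u = \frac{\boldsymbol\gamma}{\beta m} \quad\text{and}\quad \alpha_1 = \alpha + \frac{\boldsymbol\gamma^2}{2\beta m}.$$

(10) shows that $\mathbf u$ is the mean velocity. For a gas at rest (in a
fixed vessel) one has therefore $\mathbf u = 0$, hence $\boldsymbol\gamma = 0$ and

$$f = e^{\alpha - \frac12\beta m\boldsymbol\xi^2}.$$

This is the dynamical proof of Maxwell's distribution law.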

18. (VI. p. 60.) Formalism of statistical mechanics 

As said in the text, Gibbs's statistical mechanics is formally 
identical with Boltzmann's theory of gases if the actual gas is 
replaced by a virtual assembly of copies of the system under 
consideration. Hence all formulae referring to averages per 
particle (small letters) can be taken over if the word 'particle' 
is replaced by 'system under consideration'. One forms the
partition function (14.24) or (15.8), $F(z) = Z(\beta)$, $z = e^{-\beta}$, and
from that the free energy (14.27)

$$f = -kT\log Z, \tag{1}$$

from which all thermodynamical quantities can be obtained
by differentiation.

This formalism includes also the case of chemical mixtures 
where the number of particles of a certain type is variable. One 
has to know how the quantity Z depends on these numbers; 
then the chemical potentials, introduced in Appendix, 8, are 
obtained by differentiating f with respect to the concentrations.
We shall mention only the method of the 'great ensemble' which 
can be used in this case. 

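In the theory of non-ideal gases the Hamiltonian splits up
into a sum

$$H = \frac{1}{2m}\sum_{i=1}^{N}\mathbf p_i^2 + U(\mathbf x_1, \dots, \mathbf x_N),$$

and the partition function into a product

$$Z = \int\!\cdots\!\int\exp\Big\{-\frac{\beta}{2m}\sum_i\mathbf p_i^2\Big\}\,d\mathbf p_1\cdots d\mathbf p_N \times \int\!\cdots\!\int\exp\{-\beta U\}\,d\mathbf x_1\cdots d\mathbf x_N.$$

The first integral can easily be evaluated and gives $(2\pi mkT)^{3N/2}$;
hence one has

$$Z = (2\pi mkT)^{3N/2}\,Q, \tag{4}$$

$$Q = \int\!\cdots\!\int\exp\{-U(\mathbf x_1, \dots, \mathbf x_N)/kT\}\,d\mathbf x_1\cdots d\mathbf x_N, \tag{5}$$

as in (6.35) of the text.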

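The method of Ursell for the evaluation of this integral applies
to the case where the potential energy is supposed to consist of
interaction in pairs between the centres of the particles,

$$U = \sum_{i>j}\Phi_{ij}, \quad \Phi_{ij} = \Phi(r_{ij}). \tag{6}$$

Then one can write

$$e^{-U/kT} = \prod_{i>j}(1 - f_{ij}), \tag{7}$$

where

$$f_{ij} = 1 - e^{-\Phi_{ij}/kT}. \tag{8}$$

The product (7) can be expanded into a series

$$\prod_{i>j}(1 - f_{ij}) = 1 - \sum f_{ij} + \sum f_{ij}f_{kl} - \dots, \tag{9}$$

and the problem of calculating Q is reduced to finding the
'cluster integrals'

$$\int\!\cdots\!\int f_{ij}\,d\mathbf x_1\cdots d\mathbf x_N, \quad \int\!\cdots\!\int f_{ij}f_{kl}\,d\mathbf x_1\cdots d\mathbf x_N, \quad \dots, \tag{10}$$

which are obviously proportional to $V^{N-1}, V^{N-2}, \dots$. Hence one
obtains for $QV^{-N}$ an expansion in powers of $V^{-1}$ which holds
for small interactions ($\Phi_{ij}/kT$ small implies $f_{ij}$ small):

$$QV^{-N} = 1 - \frac{\alpha}{V} + \frac{\beta}{V^2} - \dots, \tag{11}$$

where $\alpha$, $\beta$ here denote the coefficients of the expansion.
Then (1) and (4) give

$$f = -kT\Big\{\frac{3N}{2}\log(2\pi mkT) + \log Q\Big\}, \tag{12}$$

$$p = -\frac{\partial f}{\partial V} = kT\frac{\partial\log Q}{\partial V} = \frac{NkT}{V}\Big(1 + \frac{A}{V} + \frac{B}{V^2} + \dots\Big), \tag{13}$$

where $A = \alpha/N$, $B = (\alpha^2 - 2\beta)/N$, ... . That is the formula (6.36)
given in the text.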

The actual evaluation of the cluster integrals is extremely 
difficult and cumbersome. The analytical properties of the power 


series (11) have been carefully investigated by J. Mayer, and by 
myself in collaboration with K. Fuchs. The theory has been 
generalized so as to include quantum effects by Uhlenbeck, 
Kahn, de Boer. Here is a list of publications: 

H. D. Ursell, Proc. Camb. Phil. Soc. 23, p. 685 (1927). 

J. E. Mayer, J. Chem. Phys. 5, p. 67 (1937). 

J. E. Mayer and P. J. Ackermann, ibid. p. 74. 

J. E. Mayer and S. F. Harrison, ibid. 6, pp. 87, 101 (1938). 

M. Born, Physica, 4, p. 1034 (1937). 

M. Born and K. Fuchs, Proc. Roy. Soc. A, 166, p. 391 (1938). 

K. Fuchs, ibid. A, 179, p. 340 (1942). 

B. Kahn and G. E. Uhlenbeck, Physica, 4, p. 299 (1938). 

B. Kahn, The Theory of the Equation of State. Utrecht Dissertation. 

J. de Boer and A. Michels, Physica, 6, p. 97 (1939). 

S. F. Streeter and J. E. Mayer, J. Chem. Phys. 7, p. 1025 (1939). 

J. E. Mayer and E. W. Montroll, ibid. 9, p. 626 (1941). 

J. E. Mayer and M. Goeppert-Mayer, Statistical Mechanics, J. Wiley
& Sons, New York (1940).

J. E. Mayer, J. Chem. Phys. 10, p. 629 (1942). 

W. G. MacMillan and J. E. Mayer, ibid. 13, p. 276 (1945). 

J. E. Mayer, ibid. 43, p. 71 (1939); 15, p. 187 (1947). 

H. S. Green, Proc. Roy. Soc. A, 189, p. 103 (1947). 

J. de Boer and A. Michels, Physica, 7, p. 369 (1940). 

J. de Boer, Contributions to the Theory of Compressed Gases. Amsterdam
Dissertation (1940).

J. Yvon, Actualités scientifiques et industrielles, 203, p. 1 (1935); p. 542
(1937); Cahiers de physique, 28, p. 1 (1945).

19. (VI. p. 62.) Quasi-periodicity

The state of a mechanical system can be represented by a 
point in the 6N-dimensional phase space p, q, and its motion by
a single orbit on a 'surface' of constant energy in this space. 
Following this orbit one must come very near to the initial point; 
the time needed will be considerable, in the range of observa- 
bility. This is the quasi-period considered by Zermelo. Yet 
there are much smaller quasi-periods if one takes into account 
that all particles are equal and indistinguishable; the gas is 
already in almost the same state as the initial one if any particle 
has come near the initial position of any other. Then the orbit 
defined above is not closed at all, yet the system has performed 
another kind of quasi-period. These periods are presumably 
small; I cannot give a mathematical proof, but it seems evident 
from the overwhelming probability of distributions near the 


most probable one. Einstein has this quasi-period in mind. It 
is certainly extremely short in the scale of observable time-intervals,
and one can therefore say that the representing 'point'
sweeps over the whole energy surface if this point is defined 
without regard to the individuality of the particles (i.e. if an 
enormous number of single points corresponding to permutation 
of all particles are regarded as one point). 

20. (VI. p. 63.) Fluctuations and Brownian motion 

The statistical conception of matter in bulk implies that 
spontaneous deviations from equilibrium are possible. There are 
several different types of problems, some of them concerned 
with the deviations from the average or fluctuations found by 
repeated observations, others with actual motion of suspended
visible particles (the Brownian motion).

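The simplest case of fluctuations is that of density, i.e. of the
number of particles in a small part $\omega V$ of the whole volume V.
One has in this case two cells of relative size $\omega$ and $1 - \omega$, and the
probability of a distribution $n_1$, $n_2 = n - n_1$, is according to
(6.15) or (14.1)

$$P(n_1) = \frac{n!}{n_1!\,(n - n_1)!}\,\omega^{n_1}(1 - \omega)^{n - n_1}. \tag{1}$$

The expectation value of $n_1$ found by repeated experiments is

$$\overline{n_1} = \sum_{n_1=0}^{n} n_1 P(n_1) = n\omega\sum_{n_1=1}^{n}\frac{(n-1)!}{(n_1 - 1)!\,(n - n_1)!}\,\omega^{n_1 - 1}(1 - \omega)^{n - n_1}.$$

According to the binomial theorem this reduces to

$$\overline{n_1} = n\omega, \tag{2}$$

as might be expected.

In order to calculate $\overline{n_1^2}$ we note that $\overline{n_1(n_1 - 1)}$ can be found in
exactly the same way as $\overline{n_1}$, namely,

$$\overline{n_1(n_1 - 1)} = \sum_{n_1=0}^{n} n_1(n_1 - 1)P(n_1) = n(n-1)\omega^2\sum_{n_1=2}^{n}\frac{(n-2)!}{(n_1 - 2)!\,(n - n_1)!}\,\omega^{n_1 - 2}(1 - \omega)^{n - n_1},$$

whence

$$\overline{n_1(n_1 - 1)} = n(n-1)\omega^2. \tag{3}$$

Hence

$$\overline{n_1^2} = n(n-1)\omega^2 + n\omega, \tag{4}$$

so that the mean square deviation is

$$\overline{(\Delta n_1)^2} = \overline{n_1^2} - \overline{n_1}^2 = n\omega(1 - \omega). \tag{5}$$

If $\omega$ is a small fraction, one obtains the well-known fluctuation
formula for independent events

$$\overline{(\Delta n_1)^2} = \overline{n_1}. \tag{6}$$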

This is directly applicable to the density fluctuation of an ideal 
gas and can be used to explain the scattering of light by a gas, 
as observed for instance in the blue of the sky (Lord Rayleigh; 
Atomic Physics, Appendix IV, p. 280). 
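
The mean and mean square deviation of the particle number in the small part of the volume can be verified by summing the binomial distribution directly. In the Python sketch below the values n = 200 and ω = 0.05 are arbitrary assumptions for illustration, and the function name is hypothetical.

```python
import math


def binomial_moments(n: int, omega: float):
    """Mean and mean square deviation of n_1 for two cells of size omega and 1 - omega."""
    mean = 0.0
    mean_sq = 0.0
    for n1 in range(n + 1):
        p = math.comb(n, n1) * omega ** n1 * (1.0 - omega) ** (n - n1)
        mean += n1 * p
        mean_sq += n1 * n1 * p
    return mean, mean_sq - mean * mean


mean_n1, var_n1 = binomial_moments(200, 0.05)
print(f"mean n_1            = {mean_n1:.6f}   (n omega = {200 * 0.05})")
print(f"mean square dev.    = {var_n1:.6f}   (n omega (1 - omega) = {200 * 0.05 * 0.95})")
```

For small ω the mean square deviation is nearly equal to the mean itself, which is the fluctuation formula for independent events.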

There are also fluctuations of other properties of a gas. As 
the state of a fluid is determined by two independent macro- 
scopic variables (p, V for instance), it suffices to calculate the 
fluctuation of one further quantity. The most convenient one 
is the energy. 

The following consideration holds, however, not only for ideal 
gases, but for any set of independent equal systems of given 
total (or mean) energy; it supposes only that the distribution 
is canonical. 

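Then all averages can be obtained with the help of the partition
function (14.24) or (15.8),

$$Z = \sum_r \omega_r e^{-\beta\epsilon_r}, \quad \beta = \frac{1}{kT}. \tag{7}$$

In particular the mean energy is (14.25)

$$u = \bar\epsilon = \frac{\sum_r\omega_r\epsilon_r e^{-\beta\epsilon_r}}{\sum_r\omega_r e^{-\beta\epsilon_r}} = -\frac{1}{Z}\frac{\partial Z}{\partial\beta} = -\frac{\partial\log Z}{\partial\beta}, \tag{8}$$

and the mean of its square

$$\overline{\epsilon^2} = \frac{\sum_r\omega_r\epsilon_r^2 e^{-\beta\epsilon_r}}{\sum_r\omega_r e^{-\beta\epsilon_r}} = \frac{1}{Z}\frac{\partial^2 Z}{\partial\beta^2}. \tag{9}$$

Hence the mean fluctuation of energy

$$\overline{(\Delta\epsilon)^2} = \overline{\epsilon^2} - \bar\epsilon^2 = \frac{1}{Z}\frac{\partial^2 Z}{\partial\beta^2} - \Big(\frac{1}{Z}\frac{\partial Z}{\partial\beta}\Big)^2 = \frac{\partial}{\partial\beta}\Big(\frac{1}{Z}\frac{\partial Z}{\partial\beta}\Big),$$

or with (8)

$$\overline{(\Delta\epsilon)^2} = -\frac{\partial u}{\partial\beta}. \tag{10}$$

If the mean energy is known as a function of temperature,
hence of $\beta$, one obtains its fluctuation by differentiation. For
example, for an ideal gas one has $u = c_vT = (c_v/k)\beta^{-1}$, hence
$\overline{(\Delta\epsilon)^2} = c_vkT^2 = (k/c_v)u^2$. Another application, to the fluctuation
of radiation, is made in section VIII, p. 79.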
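
The relation between the energy fluctuation and the derivative of the mean energy with respect to β holds for any canonical distribution and is easily checked numerically. The following Python sketch uses an arbitrary set of four levels with assumed energies and weights (hypothetical values chosen for illustration) and compares the directly computed mean square deviation with a finite-difference derivative of the mean energy.

```python
import math


def canonical_stats(energies, weights, beta):
    """Return (Z, mean energy, mean square deviation) for a canonical distribution."""
    boltz = [w * math.exp(-beta * e) for e, w in zip(energies, weights)]
    z = sum(boltz)
    mean = sum(e * b for e, b in zip(energies, boltz)) / z
    mean_sq = sum(e * e * b for e, b in zip(energies, boltz)) / z
    return z, mean, mean_sq - mean * mean


def minus_du_dbeta(energies, weights, beta, h=1e-5):
    """-du/d(beta) by central difference."""
    u_plus = canonical_stats(energies, weights, beta + h)[1]
    u_minus = canonical_stats(energies, weights, beta - h)[1]
    return -(u_plus - u_minus) / (2.0 * h)


# Assumed toy spectrum: energies and cell weights are illustrative only.
levels = (0.0, 1.0, 2.0, 5.0)
cell_weights = (1.0, 2.0, 1.0, 3.0)
beta_value = 0.7

_, u_mean, fluctuation = canonical_stats(levels, cell_weights, beta_value)
derivative = minus_du_dbeta(levels, cell_weights, beta_value)
print(f"direct mean square deviation = {fluctuation:.8f}")
print(f"-du/d(beta)                  = {derivative:.8f}")
```

The two numbers agree to the accuracy of the finite-difference step, whatever spectrum is chosen.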

If one wishes to determine the fluctuations of a part of a body 
which cannot be decomposed into independent systems, these 
simple methods are not applicable. 

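Einstein has invented a most ingenious method which can be
applied in such cases. It consists in reversing Boltzmann's
equation

$$S = k\log P, \tag{11}$$

taking S as a known function of observable parameters, and
determining the probability P from it,

$$P = e^{S/k}. \tag{12}$$

Assume the whole system is divided into N small, but still
macroscopic parts and $\Delta u_i$ is the fluctuation of energy in one of
them; then one has for the entropy in this part

$$S_i = S_i^0 + \Big(\frac{\partial S_i}{\partial u_i}\Big)_0\Delta u_i + \frac12\Big(\frac{\partial^2 S_i}{\partial u_i^2}\Big)_0(\Delta u_i)^2 + \dots.$$

If the whole system is adiabatically isolated one has

$$\sum_i\Delta u_i = 0.$$

By adding up all fluctuations one gets for the entropy

$$S = S_0 - k\gamma\sum_i\xi_i^2 + \dots, \tag{13}$$

where the abbreviations

$$\xi_i = \Delta u_i, \quad \gamma = -\frac{1}{2k}\frac{\partial^2 S}{\partial U^2} \tag{14}$$

are used. According to the second law of thermodynamics one
has for constant volume

$$dS = \frac{dU}{T}, \quad \frac{\partial S}{\partial U} = \frac{1}{T}, \quad \frac{\partial^2 S}{\partial U^2} = -\frac{1}{T^2}\frac{\partial T}{\partial U} = -\frac{1}{T^2c_v}, \tag{15}$$

where $c_v = dU/dT$ is the specific heat for constant volume.
Substituting (13) in (12) one obtains approximately

$$P = P_0\,e^{-\gamma\sum_i\xi_i^2};$$

hence the mean square fluctuation of energy in one (macroscopic)
cell is

$$\overline{\xi^2} = \frac{\int\xi^2e^{-\gamma\xi^2}\,d\xi}{\int e^{-\gamma\xi^2}\,d\xi} = -\frac{\partial}{\partial\gamma}\log\int e^{-\gamma\xi^2}\,d\xi = \frac{1}{2\gamma},$$

or

$$\overline{(\Delta u_i)^2} = kT^2c_v. \tag{16}$$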

This result is formally identical with that for an ideal gas ob- 
tained above, yet holds also if c v is any function of T. 

In a similar way other fluctuations can be expressed in terms 
of macroscopic quantities. 

We now turn to the theory of Brownian motion which is also 
due to Einstein. His original papers on this subject are collected 
in a small volume Investigations on the Theory of the Brownian 
Movement, by R. Fürth, translated by A. D. Cowper (Methuen
& Co. Ltd., London, 1926) and make delightful reading. Here 
I give the main ideas of this theory in a slight modification 
formulated independently by Planck and Fokker. 

Let $f(x,t)\,dx$ be the probability that the centre of a colloidal
(visible) particle has an x-coordinate between $x$ and $x+dx$ at
time $t$. The particle may be subject to a constant force F and
to the collisions of the surrounding molecules. The latter will
produce a friction-like effect; if the particle is big compared with
the molecules, its acceleration may be neglected and the velocity
component in the x-direction assumed to be proportional to the
force

$$\xi = BF, \qquad (17)$$

where B is called the 'mobility'. Apart from this quasi-continuous
action, the collisions will produce tiny irregular displacements
which can be described by a statistical law, namely, by
defining a function $\phi(x)$ which represents the probability for a
particle to be displaced in the positive x-direction by x during a
small but finite interval of time $\tau$.

Then one obtains a kind of collision equation (which is simpler
than Boltzmann's in the kinetic theory of gases, as no attempt
is made to analyse the mechanism of collision in detail): The
convective increase of $f(x,t)$ in the time-interval $\tau$,

$$\tau\frac{df}{dt} = \tau\left(\frac{\partial f}{\partial t} + \xi\frac{\partial f}{\partial x}\right),$$

is not zero but equal to the difference of the effect of the collisions
which carry a particle from $x'$ to $x$ and those which remove the
particle at $x$ to any other place $x'$,

$$\tau\left(\frac{\partial f}{\partial t} + \xi\frac{\partial f}{\partial x}\right) = \int_{-\infty}^{\infty}\{f(x-x') - f(x)\}\,\phi(x')\,dx'. \qquad (18)$$


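$\phi(x)$ may be normalized to unity and the mean of the displacement
and of its square introduced by

$$\int_{-\infty}^{\infty}\phi(x)\,dx = 1, \qquad \int_{-\infty}^{\infty} x\,\phi(x)\,dx = \overline{\Delta x}, \qquad \int_{-\infty}^{\infty} x^2\phi(x)\,dx = \overline{(\Delta x)^2}. \qquad (19)$$

Further it may be supposed that the range of $\phi(x)$ is small; then
one can expand $f(x-x')$ on the right-hand side of (18) and obtain
a differential equation for $f(x,t)$ which, with (17) and (19), can
be written

$$\tau\left(\frac{\partial f}{\partial t} + BF\frac{\partial f}{\partial x}\right) = -\overline{\Delta x}\,\frac{\partial f}{\partial x} + \frac12\overline{(\Delta x)^2}\,\frac{\partial^2 f}{\partial x^2}. \qquad (20)$$

Let us assume that the irregular action of the collisions is
symmetric in x, $\phi(x) = \phi(-x)$; then $\overline{\Delta x} = 0$.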


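Consider first statistical equilibrium; then $\partial f/\partial t = 0$, and

$$\tau BF\frac{\partial f}{\partial x} = \frac12\overline{(\Delta x)^2}\,\frac{\partial^2 f}{\partial x^2}. \qquad (21)$$

Now the coordinate x of the colloidal particle can be included in
the total set of coordinates of the whole system, if a term $-Fx$
is added to the Hamiltonian, so that the canonical law of
distribution contains the factor $e^{\beta Fx}$, $\beta = 1/kT$. Hence the solution
of (21) must have the form

$$f = f_0\, e^{\beta Fx}, \qquad (22)$$

so that

$$\frac{\partial^2 f}{\partial x^2} = \beta F\frac{\partial f}{\partial x}.$$

If this is substituted in (21), one finds

$$\frac{\overline{(\Delta x)^2}}{2\tau} = D = kTB. \qquad (23)$$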

We consider now the motion of the particles without an
external field (F = 0), under the action of the collisions only.
Then (20) reads

$$\frac{\partial f}{\partial t} = D\frac{\partial^2 f}{\partial x^2}. \qquad (24)$$

This is the well-known equation of diffusion. Einstein's main 
result consists in the double formula (23) which connects the 
mean square displacement with the coefficient of diffusion D and 
with temperature and mobility. 

If the particle is known to be at a given position, say x = 0,
at t = 0, the probability of finding it at x after the time t is the
following solution of (24):

$$f(x,t) = \frac{1}{\sqrt{4\pi Dt}}\,e^{-x^2/4Dt}; \qquad (25)$$

the mean square of the coordinate, or the 'spread' of probability
after the time t is found by a simple calculation:

$$\overline{x^2} = \int_{-\infty}^{\infty} x^2 f(x,t)\,dx = 2Dt, \qquad (26)$$

which for $t = \tau$ is equal to the mean square displacement $\overline{(\Delta x)^2}$
given by (23).
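
The linear growth of the spread with time can be illustrated by a small simulation. This is a sketch under stated assumptions: every displacement is taken to be $+a$ or $-a$ with equal probability (a particular symmetric choice of $\phi$), and the step length, time step, and number of particles are hypothetical values.

```python
import random

def mean_square_displacement(n_particles, n_steps, step, seed=0):
    """Average of x^2 after n_steps symmetric jumps of length +/- step."""
    rng = random.Random(seed)        # fixed seed so the run is reproducible
    total = 0.0
    for _ in range(n_particles):
        x = 0.0
        for _ in range(n_steps):
            x += step if rng.random() < 0.5 else -step
        total += x * x
    return total / n_particles

tau = 1.0                      # duration of one jump (arbitrary units)
step = 0.1                     # jump length a, so that (Delta x)^2 = a^2
D = step ** 2 / (2 * tau)      # diffusion coefficient from (23)

n_steps = 100
t = n_steps * tau
msd = mean_square_displacement(n_particles=4000, n_steps=n_steps, step=step)
print(msd, 2 * D * t)
```

With these numbers $2Dt = 1$, and the simulated mean square should lie within a few per cent of it; the residual scatter is itself a statistical fluctuation.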
These formulae can be used in different ways to determine
Boltzmann's constant k, or Avogadro's number N = R/k (where
R is the gas constant per mole), i.e. the number of molecules per
mole. A static method consists in observing the sedimentation
under gravity of a colloid solution; then $F = -mg$, where m is
the mass of the colloid particle, and the number of particles
decreases with height according to the law (22), which now reads

$$n = n_0\, e^{-mgx/kT}.$$

In order to apply this formula one has to determine the mass.
For spherical particles $m = (4\pi/3)r^3\rho$, where $\rho$ is the density and
r the radius. The mobility of a sphere in a liquid of viscosity $\eta$
has been calculated by Stokes from the hydrodynamical equations,
with the result

$$B = \frac{1}{6\pi\eta r}; \qquad (27)$$

hence it falls under gravity, $F = mg = (4\pi/3)r^3\rho g$, with
the velocity (17)

$$\xi = BF = \frac{2r^2\rho g}{9\eta}.$$

As $\xi$ can be measured, r can be found, if $\rho$ and $\eta$ are known, and
finally m.

Another method is a dynamical one. One observes the
displacements $\Delta x_1, \Delta x_2, \dots$ of a single colloid particle in equal
intervals $\tau$ of time and forms the mean square $\overline{(\Delta x)^2}$. Then using the
same method as just described for determining the radius r,
one finds B from (27) and then k from (23).
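
The dynamical method can be sketched as a short calculation. Everything numerical below is an assumption made for illustration: the viscosity, radius, temperature, and observation interval are hypothetical, the displacements are synthetic (constructed from an assumed value of k rather than measured), and the constants are modern values not taken from the text.

```python
import math

def stokes_mobility(eta, r):
    """Mobility of a sphere of radius r in a liquid of viscosity eta, as in (27)."""
    return 1.0 / (6.0 * math.pi * eta * r)

def boltzmann_from_displacements(displacements, tau, temperature, mobility):
    """Solve (23), mean((dx)^2) / (2 tau) = k T B, for k."""
    mean_sq = sum(dx * dx for dx in displacements) / len(displacements)
    return mean_sq / (2.0 * tau * temperature * mobility)

# Hypothetical experiment in SI units (illustration only).
eta = 1.0e-3          # viscosity of the liquid
r = 0.5e-6            # radius of the colloid particle
T = 293.0             # absolute temperature
tau = 30.0            # interval between observations

B = stokes_mobility(eta, r)

# Synthetic displacements whose mean square equals 2 k T B tau exactly.
k_assumed = 1.380649e-23
rms = math.sqrt(2.0 * k_assumed * T * B * tau)
displacements = [rms, -rms, rms, -rms]

k_recovered = boltzmann_from_displacements(displacements, tau, T, B)
R = 8.314462618       # gas constant per mole
avogadro = R / k_recovered
print(k_recovered, avogadro)
```

In a real experiment the list of displacements would come from observation, and the scatter of the recovered k would depend on how many intervals are averaged.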

In this way the first reliable determinations of N have been 
made. Among those who have developed the theory M. v. 
Smoluchowski has played a distinguished part, while the first 
systematic measurements are due to J. Perrin. 

A new and interesting approach to the theory of Brownian 
motion may be mentioned: J. G. Kirkwood, J. Chem. Phys. 
14, p. 180 (1946); 15, p. 72 (1947). 

21. (VI. p. 67.) Reduction of the multiple distribution 

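The total Hamiltonian $H_N$ of N particles can be split into two
parts, the first being the Hamiltonian $H_{N-1}$ of N−1 particles,
the second the interaction of these with the last particle:

$$H_N = H_{N-1} + \tfrac12 m\,\xi^{(N)2} + \Phi^{(N)} + \sum_{i=1}^{N-1}\Phi^{(iN)}, \qquad (1)$$

where $\Phi^{(i)}$ is the external potential on the particle i and $\Phi^{(ij)}$ the
mutual potential between two particles i and j.

Now we apply the operator $\chi_N$ to the equation for the total
system

$$\frac{\partial f_N}{\partial t} = [H_N, f_N]. \qquad (2)$$

From (6.40) we have, for q = N−1,

$$\chi_N f_N = f_{N-1}.$$

Hence

$$\frac{\partial f_{N-1}}{\partial t} = \chi_N[H_{N-1}, f_N] + \chi_N[H_N - H_{N-1}, f_N].$$

Here the first term on the right-hand side becomes

$$\chi_N[H_{N-1}, f_N] = [H_{N-1}, \chi_N f_N] = [H_{N-1}, f_{N-1}],$$

since $H_{N-1}$ does not depend on the particle N to which the
operator $\chi_N$ refers. Further,

$$\chi_N\left[\tfrac12 m\,\xi^{(N)2} + \Phi^{(N)},\, f_N\right] = 0,$$

for if the integration $\chi_N$ is performed, the result refers to values
of $f_N$ at infinity of the $x^{(N)}$ and $\xi^{(N)}$ respectively, and these vanish
as there is no probability for particles to be at an infinite distance
or to have infinite velocities.

If all this is substituted in (2) one obtains

$$\frac{\partial f_{N-1}}{\partial t} = [H_{N-1}, f_{N-1}] + \chi_N\Big[\sum_{i=1}^{N-1}\Phi^{(iN)},\, f_N\Big].$$

Repeating the same process with $\chi_{N-1}, \chi_{N-2}, \dots$ one obtains the
chain of equations (6.44), (6.45) of the text.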

22. (VI. p. 68.) Construction of the multiple distribution 

The fundamental multiplication theorem for non-independent
events can be obtained in the following way.

Any event of a given set may have a certain property A or
not, $\bar{A}$. If B is another property we indicate by AB those events
which have both the properties A and B.

Then all events can be split into four groups $AB$, $A\bar{B}$, $\bar{A}B$,
$\bar{A}\bar{B}$, with the probabilities $p_{AB}$, $p_{A\bar{B}}$, $p_{\bar{A}B}$, $p_{\bar{A}\bar{B}}$.

The probability of A is

$$p_A = p_{AB} + p_{A\bar{B}}. \qquad (1)$$

On the other hand, if A is known to occur, the cases $\bar{A}B$, $\bar{A}\bar{B}$ are
excluded, hence the probability of B is

$$p_B(A) = \frac{p_{AB}}{p_{AB} + p_{A\bar{B}}} = \frac{p_{AB}}{p_A}, \qquad\text{or}\qquad p_{AB} = p_A\, p_B(A), \qquad (2)$$

which is the multiplication rule; it reduces to the ordinary one
for independent events if $p_B(A)$ does not depend on A and is
equal to $p_B$.
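
A small worked example may make the rule concrete. The counts below are hypothetical and serve only as an illustration; exact fractions are used so that the identities hold without rounding.

```python
from fractions import Fraction

# Hypothetical numbers of events in the four groups AB, A notB, notA B, notA notB.
counts = {
    ("A", "B"): 12,
    ("A", "notB"): 18,
    ("notA", "B"): 20,
    ("notA", "notB"): 50,
}
total = sum(counts.values())
p = {group: Fraction(n, total) for group, n in counts.items()}

p_A = p[("A", "B")] + p[("A", "notB")]       # equation (1)
p_B_given_A = p[("A", "B")] / p_A            # equation (2): probability of B if A occurs
p_B = p[("A", "B")] + p[("notA", "B")]       # probability of B without any condition

print(p_A, p_B_given_A, p_B)
```

Here the conditional probability of B differs from the unconditional one, so the two properties are not independent, yet the product $p_A\,p_B(A)$ still reproduces $p_{AB}$.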

This rule can be applied to a mechanical system of N particles 
in the following way. 

Let A signify that q particles are in given elements of phase
space; the probability of A can be written

$$p_A = f_q\,dx^{(1)}d\xi^{(1)}\dots dx^{(q)}d\xi^{(q)}. \qquad (3)$$

Let B mean that the element q+1 is occupied. Then AB
expresses that all q+1 elements are occupied, or

$$p_{AB} = f_{q+1}\,dx^{(1)}d\xi^{(1)}\dots dx^{(q+1)}d\xi^{(q+1)}. \qquad (4)$$

Hence $p_B(A)$, the probability for the element q+1 being
occupied, if q particles are in given elements, is

$$p_B(A) = \frac{p_{AB}}{p_A} = \frac{f_{q+1}}{f_q}\,dx^{(q+1)}d\xi^{(q+1)}. \qquad (5)$$

If this is summed over all possible positions and velocities of the
last particle (q+1), the result is equal to the number of particles
excluding the q fixed ones, N−q; hence, with the normalization
described in the text, (6.42) and (6.43),

$$(N-q)f_q = \iint f_{q+1}\,dx^{(q+1)}d\xi^{(q+1)} = \chi_{q+1}f_{q+1}, \qquad (6)$$

which is the formula (6.40) of the text.

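In order to construct the equation (6.44) for the rate of change
of $f_q$, one has to introduce a generalized distribution function
which depends not only on the position x and the velocity $\xi$
but also on the acceleration $\eta$ of the particles; the probability
for a set of q particles to be in the element

$$dx^{(1)}d\xi^{(1)}d\eta^{(1)}\dots dx^{(q)}d\xi^{(q)}d\eta^{(q)}$$

shall be denoted by

$$g_q(t, x^{(1)}, \xi^{(1)}, \eta^{(1)}, \dots, x^{(q)}, \xi^{(q)}, \eta^{(q)})\,dx^{(1)}d\xi^{(1)}d\eta^{(1)}\dots dx^{(q)}d\xi^{(q)}d\eta^{(q)}.$$

One has obviously

$$\int\dots\int g_q\,d\eta^{(1)}\dots d\eta^{(q)} = f_q. \qquad (7)$$

Now the motion of the molecules follows causal laws; hence
the probability $f_q$ of a configuration in x, $\xi$-space at a time t must
be the same as that at the time $t+\delta t$ of that configuration which
is obtained from the first by substituting $x^{(i)} + \xi^{(i)}\delta t$ and
$\xi^{(i)} + \eta^{(i)}\delta t$ for $x^{(i)}$ and $\xi^{(i)}$.
Hence (7) leads to

$$\int\dots\int\Big\{\frac{\partial g_q}{\partial t} + \sum_{i=1}^{q}\Big(\xi^{(i)}\cdot\frac{\partial g_q}{\partial x^{(i)}} + \eta^{(i)}\cdot\frac{\partial g_q}{\partial\xi^{(i)}}\Big)\Big\}\,d\eta^{(1)}\dots d\eta^{(q)} = 0. \qquad (8)$$

The integration in the first two terms can be performed with
the help of (7); that of the last leads to the integral

$$f_q\,\bar{\eta}^{(i)} = \int\dots\int g_q\,\eta^{(i)}\,d\eta^{(1)}\dots d\eta^{(q)}, \qquad (9)$$

where the symbol $\bar{\eta}^{(i)}$ is evidently the mean acceleration.
Hence one obtains from (8)

$$\frac{\partial f_q}{\partial t} + \sum_{i=1}^{q}\Big\{\xi^{(i)}\cdot\frac{\partial f_q}{\partial x^{(i)}} + \frac{\partial}{\partial\xi^{(i)}}\cdot\big(f_q\,\bar{\eta}^{(i)}\big)\Big\} = 0. \qquad (10)$$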
The final step consists in using the laws of mechanics for
determining $\bar{\eta}^{(i)}$. The equations of motion are (force $P^{(i)}$)

Now the function $f_q$ refers to the case where the positions and
velocities of q particles are given, the others unknown. Hence
one has to split the sum (11) into two parts, the first referring to
the given particles, the second to the rest. For this rest the
probability of finding a particle in a given element q+1 is known,
namely $(f_{q+1}/f_q)\,dx^{(q+1)}d\xi^{(q+1)}$; hence the average of this sum can
be determined by integrating over $dx^{(q+1)}d\xi^{(q+1)}$, i.e. by applying
the operator $\chi_{q+1}$. In this way the mean acceleration is found to be

Substituting this in (10), one obtains 

/d(T)(t,2+l) fif \ 

* u ^ ' u jq+l\ MQ\ 

g x(< ) ~J9g(o) ^> 

which is easily confirmed to be identical with the formulae (6.44), 
(6.45) of the text. 

23. (VI. p. 69.) Derivation of the collision integral from the 
general theory of fluids 

From the standpoint of statistical theory a fluid differs from
a solid by the absence of a long-range order, so that for two events
A and B happening a long distance apart one has, with the
notations of Appendix, 22, $p_{AB} = p_A\cdot p_B$; for instance, for large
$|x^{(2)} - x^{(1)}|$ one has $f_2(x^{(1)}, x^{(2)}) = f_1(x^{(1)})f_1(x^{(2)})$, while in solids
this is not the case.

The distinction between liquid and gas is not so sharp and 
may even be said to disappear above the critical state. However, 
if one is not specially concerned with these intermediate con- 
ditions there is a wide region where liquid and gas can be dis- 
tinguished by the extreme difference of density. From the 
atomistic standpoint this has to be formulated thus: 

The potential energy $\Phi(x^{(i)}, x^{(j)})$ between two molecules at
$x^{(i)}$ and $x^{(j)}$ decreases rapidly with the distance between their
mass centres, and (except in the case of ions, where Coulomb
forces act) a distance $r_0$, small by macroscopic standards, may
be specified, beyond which the interaction may without error
be assumed to vanish completely. In a liquid proper, there are
many molecules within this distance $r_0$ of a given molecule; in a
gas there are usually none, and the probability that there is more
than one is very small, except near condensation. The neglect
of this small probability is equivalent to the assumption of
'binary encounters' in gas-theory. Green has shown that when
this assumption is made, on taking q = 1 in the equations (6.44)
and (6.45) of the text,

$$\frac{\partial f_1}{\partial t} - [H_1, f_1] = S_1, \qquad (1)$$

one obtains Boltzmann's collision equations (6.46), (6.47).

To prove this we first work out the expression $S_1$ using the
definition (6.5) of the Poisson bracket and of the operator $\chi$,

(see also 22.13). With the assumption of binary encounters $f_2$
can be expressed in terms of $f_1$ by using the mechanical laws of

Consider the motion of two molecules which at time t have
positions $x^{(1)}$, $x^{(2)}$, such that $|x^{(2)} - x^{(1)}| < r_0$, and velocities
$\xi^{(1)}$, $\xi^{(2)}$; while at time $t_0$ (< t), when the molecules were last at a
distance $r_0$ from one another, their positions and velocities were
$x_0^{(1)}$, $x_0^{(2)}$ and $\xi_0^{(1)}$, $\xi_0^{(2)}$. The configurational probability

$$f_2(t, x^{(1)}, x^{(2)}, \xi^{(1)}, \xi^{(2)})\,dx^{(1)}dx^{(2)}d\xi^{(1)}d\xi^{(2)}$$

must remain unchanged during the interval $(t_0, t)$ as the motion
follows a causal law; also, by Liouville's theorem, the volume in
phase space $dx^{(1)}dx^{(2)}d\xi^{(1)}d\xi^{(2)}$ is unaltered. Since, as explained
above, molecular events in fluids which occur beyond the range
of interaction must be considered independent, one has

$$f_2(t, x^{(1)}, x^{(2)}, \xi^{(1)}, \xi^{(2)}) = f_1(t_0, x_0^{(1)}, \xi_0^{(1)})\,f_1(t_0, x_0^{(2)}, \xi_0^{(2)}). \qquad (4)$$

Next one introduces an approximate assumption which is
always made in gas-theory, that $t_0$, $x_0^{(1)}$, $x_0^{(2)}$ may be replaced by
$t$, $x^{(1)}$ and $x^{(2)}$ on the right-hand side of (4) (but of course not
$\xi_0^{(1)}$, $\xi_0^{(2)}$ by $\xi^{(1)}$, $\xi^{(2)}$). As $r_0$ is very small the resulting error is of
microscopic order; nevertheless it is not without importance,
for it allows small deviations from Maxwell's velocity distribution
law (and other 'fluctuations'), which would otherwise be
unexplainable, as this law is a rigorous consequence of
Boltzmann's collision equation in equilibrium conditions.

It remains to calculate $\xi_0^{(1)}$ and $\xi_0^{(2)}$ in terms of $\xi^{(1)}$, $\xi^{(2)}$ and
$r = x^{(2)} - x^{(1)}$, which can be done by using the canonical equations
of motion or their independent integrals (conservation of energy,
momentum, and angular momentum). The resulting formulae
are the same as used in Boltzmann's theory (see Appendix, 16).
The reduction of $S_1$ can be performed without making use of
explicit expressions. One has only to remark that

$$f_2 = f_1(\xi_0^{(1)})\,f_1(\xi_0^{(2)})$$

now satisfies the equation

$$[H_2, f_2] = 0, \qquad (5)$$

where

$$H_2 = \tfrac12 m\big(\xi^{(1)2} + \xi^{(2)2}\big) + \Phi(r) \qquad (6)$$

is the Hamiltonian of the two particles which are considered to
move independently of all the others. Now (5) becomes

We integrate this over $dx^{(2)}d\xi^{(2)}$; then the term with $\partial f_2/\partial\xi^{(2)}$ on
the right-hand side vanishes, because there are no particles with
infinite velocities. The other term, with $\partial f_2/\partial\xi^{(1)}$, becomes
identical with $S_1$ according to (3), since

Hence, with (4),

$$S_1 = \iint\big(\xi^{(2)} - \xi^{(1)}\big)\cdot\frac{\partial}{\partial r}\big\{f_1(\xi_0^{(1)})\,f_1(\xi_0^{(2)})\big\}\,dr\,d\xi^{(2)}, \qquad (8)$$

where the domain of integration over r may be limited by the
sphere of radius $r_0$ surrounding $x^{(1)}$.

This integration can be performed by imagining the sphere to
be partitioned by elementary tubes parallel to the relative
velocity $\xi^{(2)} - \xi^{(1)}$; one may then integrate, first over a typical
tube specified by the cross-section radius b, perpendicular from
the centre of the sphere to the tube (see Appendix, 16), and then
over all values of b. At the beginning of the tube, where

the interaction between the molecules is negligible, and the
functions giving $\xi_0^{(1)}$ and $\xi_0^{(2)}$ in terms of $\xi^{(1)}$, $\xi^{(2)}$, and r reduce to
$\xi^{(1)}$ and $\xi^{(2)}$. At the end of the tube the values $\xi^{(1)\prime}$ and $\xi^{(2)\prime}$ of these
functions have to be calculated from the collision integrals, just
as in Boltzmann's theory. Thus one obtains

>flg, (9) 

which is identical with (6.47) of the text and the collision integral 
in (16.8). 

This derivation is not more complicated than Boltzmann's
original one and is preferable because it reveals clearly the
assumptions made.

24. (VII. p. 72.) Irreversibility in fluids

A rigorous proof of the irreversibility in dense matter from the 
classical standpoint seems to be very difficult, or at least ex- 
tremely tedious. Green has, however, suggested a derivation 
which, though not quite rigorous, is plausible enough and 
certainly based on reasonable approximations. 

It has to be shown that the entropy S defined by (7.1) never
decreases in time, so that

satisfies the equation 

1 ), (2) 

which expresses that one particle of unknown position and 
velocity is added to a system of N particles. 

If $\Phi$ is the total potential energy between the N particles and
$\Phi^{(i,N+1)}$ that between the ith of these and the additional particle,
one has

8t ~ 

<-l i-l 

If this is substituted in (1) the integrals of the first two terms
vanish on transformation to surface integrals. In the last sum
all terms contribute the same, as $f_N$ and $f_{N+1}$ can be assumed to
be symmetric in regard to all particles. Hence

Now the reasoning follows very closely that of Appendix, 23,
where (23.3) was transformed into the integrable expression
(23.8) with the help of the identity (23.7).

For this purpose one introduces instead of the velocities of the
two particles (1) and (N+1) appearing explicitly in (4) new
variables, namely their total momentum m, two components
of the relative angular momentum a, and the relative energy w,

m = m( 1 >+^+ 1 >), a = m(x^+ 1 >-x< 1 >) A 

w = 

and regards $f_{N+1}$ as a function of these, so that

$$f_{N+1} = f_{N+1}(t, x^{(1)}, \dots, x^{(N+1)}, \xi^{(2)}, \dots, \xi^{(N)}, m, a, w).$$

Then by direct differentiation it can be verified that

_ / g/^ a/ A7+1 \ 

"" vs ' s ; '\ax^ +1 > ax^ +1 v' 

an equation similar to (23.7). 

^f ^A(i-^+i) 

If JN+I u - j s taken from it and substituted in (4), the 
dt> (l) dx (1) 

only term which does not vanish is found to be 

x (OT-D_(D) . rfx0^^...rfx^+ 1 5^+ (6) 

m, a, w are parameters specifying the trajectories which would
be followed by the particles numbered (1) and (N+1) if no other
particles were present. Now one can apply the same reasoning
as in Appendix, 23, partitioning the $x^{(N+1)}$-domain by tubes
formed by the trajectories of (N+1) relative to (1), where
m, a, w are constant, and one can perform the integration with
respect to $x^{(N+1)}$ first along such a tube, then over all values of
the cross-section b. At each end of the trajectory where the
interaction $\Phi^{(1,N+1)}$ can be neglected, the function $f_{N+1}$ would
factorize into $f_1^{(N+1)}f_N$, provided no other particle were near to
the particle (N+1).

This is, of course, not the case; but it seems to be reasonable to
assume that the factorization is at least approximately correct
as the action of the rest will nearly cancel. This is the
simplification made by Green. It is clear that it could be corrected by a
more detailed consideration; but let us be content with it.

Since the sphere around $x^{(1)}$ in which $\Phi^{(1,N+1)}$ is effectively
different from zero is of microscopic dimensions, the values of
$x^{(N+1)}$ and $x^{(1)}$ need not be distinguished, nor the instants when
these points are reached. The initial velocities $\xi^{(1)\prime}$, $\xi^{(N+1)\prime}$ must,
however, be determined from the actual final velocities from the
'conservation' law, i.e. the definitions (5) for constant m, a, w:

m(x< A 5 (1 >' 

If the integration in (5) is performed as described, one obtains 

=m J (2 " t2) J 

x ||(tf+D_|(D| dbdxdx^\..dx^d^\..d^^\ (7) 

where instead of $x^{(1)}$ the centre $x = \tfrac12(x^{(1)} + x^{(N+1)})$ is introduced.
Here $f_1^{(N+1)}$ means $f_1(x^{(N+1)}, \xi^{(N+1)})$ which can be replaced,
according to formula (6.40) of the text, by

1 f f 

If this is introduced into (7) one has an integral over 
variables, where the integrand contains the factor f^fz-/^/^ 
By repeating the procedure one can transform (7) into the 

f - pr^p / J 

x | 


where $F_N$ is the function obtained by replacing the variables
$x^{(i)}$ and $\xi^{(i)}$ in $f_N$ by $x^{(i+N)}$ and $\xi^{(i+N)}$ respectively.

Now one can apply the same transformations as for gases,
as explained in Appendix, 17, which lead from (17.3) to (17.4),
exchanging the dashed and undashed variables, and exchanging
the two groups (1, 2, ..., N) and (1+N, 2+N, ..., 2N). As it is
obvious that the integral is invariant for these changes, one obtains

f - pra J '"' J "*(f )<^-/n> x 

x |5v+i>_5<i>| dbdJid^...d^^d^ N +^...d-^^d^...d^>, (8) 

which makes it clear that dS/dt is positive or zero, and that the
latter happens only if

$$f_N F_N = f'_N F'_N. \qquad (9)$$

The solution of this equation leads again essentially to the 
canonical distribution. I shall, however, not reproduce the 
derivation but refer the reader to the original papers: 

M. Born and H. S. Green, Nature, 159, pp. 251, 738 (1947).
M. Born and H. S. Green, Proc. Roy. Soc. A, 188, p. 10 (1946); 190, p. 455 (1947);
191, p. 168 (1947); 192, p. 166 (1948).
H. S. Green, ibid. 189, p. 103 (1947); 194, p. 244 (1948).

The reader may compare this involved and, in spite of the 
complication, not quite rigorous derivation from classical theory 
with the simple and straightforward proof from quantum theory 
given in section IX. 

I wish to add an argument, also due to Green, which shows 
that once the increase of entropy is secured the distribution 
approaches the canonical one. The latter is given by 

flr-e-*. a = ^. = -!, (10) 

A is the free energy and E the energy, given by 
E = Jm f ($ (i) -u<*>)2 


$u^{(i)}$ being the macroscopic velocity at the point $x^{(i)}$.
Let the actual distribution be 


one has

$$\iint f_N\,dx\,d\xi = N!, \qquad \iint f_N E\,dx\,d\xi = N!\,U,$$

where U is the internal energy, and the same holds, of course, for
$f_N^0$, so that

$$\iint f'_N\,dx\,d\xi = 0, \qquad \iint f'_N E\,dx\,d\xi = 0. \qquad (13)$$



= ~Fi JJ {^ 

Here the terms linear in $f'_N$ vanish in virtue of (13), and one
obtains

This shows that an increase in the value of S requires a decrease
in the average value of $|f'_N|$ and therefore an approach to the
canonical distribution.

25. (VIII. p. 75.) Atomic physics 

It seems impossible to supplement this and the following 
sections, which deal with atomic physics in general, by appen- 
dixes in the same way as before. The reader must consult 
the literature; he will find a condensed account of these things 
in my own book Atomic Physics (Blackie & Son, Glasgow; 
4th edition 1948), which is constructed in a similar way to the 
present lectures; the text uses very little mathematics, while a 
series of appendixes contain short and rigorous proofs of the 
theorems used. For instance, Einstein's law of the equivalence 
of mass and energy is dealt with in Chapter III, 2, p. 52, 
and a short derivation of the formula $E = mc^2$ given in A. Ph.
Appendix VII, p. 288. Whenever in the following sections I 
wish to direct the reader to a section or appendix of my other 
book, an abbreviation like (A. Ph. Ch. III. 2, p. 52; A. VII, 
p. 288) is used. 


26. (VIII. p. 77.) The law of equipartition 

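If the Hamiltonian has the form (8.5), or

$$H = \epsilon + H', \qquad \epsilon = \frac{a}{2}\xi^2, \qquad (1)$$

where H' does not depend on $\xi$, one has for the average of $\epsilon$ in a
canonical assembly

$$\bar{\epsilon} = \frac{\int_{-\infty}^{\infty}\epsilon\,e^{-\beta\epsilon}\,d\xi}{\int_{-\infty}^{\infty}e^{-\beta\epsilon}\,d\xi},$$

as all other integrations in numerator and denominator cancel.
Now this can be written

$$\bar{\epsilon} = -\frac{\partial}{\partial\beta}\log Z, \qquad Z = \int_{-\infty}^{\infty}e^{-\beta\epsilon}\,d\xi. \qquad (2)$$

If the integration variable $\eta = \xi\sqrt{\beta a/2}$ is introduced one gets

$$Z = A\beta^{-1/2},$$

where A is a constant. Hence $\log Z = \text{const.} - \tfrac12\log\beta$ and

$$\bar{\epsilon} = \frac{1}{2\beta} = \tfrac12 kT \qquad (3)$$

in agreement with (8.6).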
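
The result can be checked by direct numerical integration. The sketch below assumes the quadratic form $\epsilon = a\xi^2/2$ of the energy term; the values of a and $\beta$, the cut-off of the integration range, and the number of sample points are hypothetical choices.

```python
import math

def mean_quadratic_energy(a, beta, n=20000):
    """Average of eps = a*xi^2/2 with weight exp(-beta*eps), by the midpoint rule."""
    half_width = 12.0 / math.sqrt(beta * a)   # the weight is about e^-72 at the ends
    h = 2.0 * half_width / n
    numerator = 0.0
    denominator = 0.0
    for i in range(n):
        xi = -half_width + (i + 0.5) * h
        eps = 0.5 * a * xi * xi
        weight = math.exp(-beta * eps)
        numerator += eps * weight
        denominator += weight
    return numerator / denominator

beta = 2.5
results = {a: mean_quadratic_energy(a, beta) for a in (0.3, 2.0, 15.0)}
print(results, 1.0 / (2.0 * beta))
```

The average comes out the same for every value of a, namely $1/2\beta$, which is the content of the equipartition law for a quadratic term.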

27. (VIII. p. 91.) Operator calculus in quantum mechanics 

The failure of matrix mechanics to deal with aperiodic motions,
continuous spectra, was less a matter of conception than of
practical methods. An indication of using integral operators
instead of matrices is contained in a paper by M. Born,
W. Heisenberg, and P. Jordan, Z. f. Phys. 35, 557 (1926), which
follows immediately after Heisenberg's first publication. The
idea that physical quantities correspond to linear operators in
general acting on functions was suggested by M. Born and
N. Wiener, Journ. Math. and Phys. 5, 84 (1926) and Z. f. Phys.
36, 174 (1926), where in particular operators of the form

were used. Here the kernel q(t,s) is a 'continuous matrix', also 


introduced by Dirac. This paper contains also the representation
of special quantities by differential operators (with respect to
time) which satisfy identically the commutation law between
energy and time $Et - tE = i\hbar$.

Schrödinger's discovery, which was made quite independently,
consists in using a representation where the coordinates are
multiplication operators and the momenta differential operators,
so that the commutation laws

are identically satisfied. This opened the way to finding the 
relation between matrix mechanics and wave mechanics and 
to the later development of the general transformation theory of 
quantum mechanics which is brilliantly represented in Dirac's 
famous book. 

The early development of quantum mechanics as represented
in text-books has become rather legendary. To mention a few
instances: the matrices and the commutation law [q,p] = 1
which are traditionally called Heisenberg's, are not explicitly
contained in his first publication: W. Heisenberg, Z. f. Phys.
33, 879 (1925); his formulae correspond only to the diagonal
terms of the commutator. The complete formulae in matrix
notation are in the paper by M. Born and P. Jordan, Z. f. Phys.
34, 858 (1925). Further, the perturbation theory of quantum
mechanics, traditionally called Schrödinger's, is contained
already in the next publication of Heisenberg, Jordan, and
myself (quoted at the beginning), not only for matrices, but also
for vectors on which these matrices operate, and not only for
simple eigenvalues, but also degenerate systems. The only
difference of Schrödinger's derivation is that he starts from a
representation with continuous wave functions which he
abandons at once in favour of a discontinuous one (by a Fourier
transformation).

28. (IX. p. 94.) General formulation of the uncertainty 

The derivation of the most general form of the uncertainty 
principle can be found in my book (A. Ph. A. XXII, p. 326). 
As it is fundamental for the reasoning in these lectures, I shall 
give it here in a little more abstract form. 


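We assume that for a complex operator $C = A + iB$ and its
conjugate $C^* = A - iB$ the mean value of the product $CC^*$ is
real and not negative:

$$\overline{CC^*} \ge 0, \qquad (1)$$

where the bar indicates any form of linear averaging, as described
in the text. Then writing $\lambda B$ instead of B, where $\lambda$ is a real
parameter, one has

$$\overline{(A + i\lambda B)(A - i\lambda B)} = \overline{A^2} + \overline{B^2}\lambda^2 + \hbar\,\overline{[A,B]}\,\lambda \ge 0, \qquad (2)$$

where the abbreviation (9.4)

$$[A,B] = \frac{1}{i\hbar}(AB - BA) \qquad (3)$$

is used. As the left-hand side of (2) is real and also the first two
terms on the right, it follows that $\overline{[A,B]}$ is real. The minimum
of the quadratic expression in $\lambda$, given by (2), occurs when

$$\lambda = -\frac{\hbar\,\overline{[A,B]}}{2\overline{B^2}},$$

and it is equal to

$$\overline{A^2} - \frac{\hbar^2\,\overline{[A,B]}^2}{4\overline{B^2}} \ge 0.$$

Hence

$$\overline{A^2}\cdot\overline{B^2} \ge \frac{\hbar^2}{4}\,\overline{[A,B]}^2. \qquad (4)$$

Now replace A by $A - \bar{A}$ and B by $B - \bar{B}$. As $\bar{A}$, $\bar{B}$ are numbers
and commute with A and B, the commutator [A, B] remains
unchanged. Putting, as in (9.2),

$$\delta A = \{\overline{(A - \bar{A})^2}\}^{1/2}, \qquad \delta B = \{\overline{(B - \bar{B})^2}\}^{1/2},$$

one obtains from (4) the formula (9.3) of the text,

$$\delta A\cdot\delta B \ge \frac{\hbar}{2}\,\big|\overline{[A,B]}\big|, \qquad (5)$$

and as [q,p] = 1, especially (9.5),

$$\delta p\cdot\delta q \ge \frac{\hbar}{2}. \qquad (6)$$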

This derivation reveals the simple algebraic root of the uncer- 
tainty relation. But it is not superfluous at all to study the 
meaning of this relation for special cases; simple examples can 


be found in A. Ph. A. XII, p. 296, A. XXXII, p. 357, and in 
many other books, for instance, Heisenberg, The Physical 
Principles of the Quantum Theory. 
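
A special case can also be checked numerically. The sketch below is an illustration outside the text: it takes for A and B the two Pauli spin matrices $\sigma_x$, $\sigma_y$ (an assumed example), averages over a pure state, and compares the product of the spreads with half the modulus of the averaged commutator $AB - BA$, which is the form the general relation takes when the bracket is the commutator divided by $i\hbar$ (an assumption about the convention).

```python
import cmath
import math

SIGMA_X = [[0, 1], [1, 0]]
SIGMA_Y = [[0, -1j], [1j, 0]]

def matmul(a, b):
    """Product of two square matrices given as nested lists."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def mean(op, psi):
    """Average of op in the normalized state vector psi."""
    n = len(psi)
    return sum(psi[i].conjugate() * op[i][j] * psi[j]
               for i in range(n) for j in range(n))

def spread(op, psi):
    """Root mean square deviation of op in the state psi."""
    m = mean(op, psi).real
    m2 = mean(matmul(op, op), psi).real
    return math.sqrt(max(m2 - m * m, 0.0))

def both_sides(theta, phi):
    """Return (product of spreads, half the modulus of the mean commutator)."""
    psi = [complex(math.cos(theta / 2)), cmath.exp(1j * phi) * math.sin(theta / 2)]
    commutator_mean = (mean(matmul(SIGMA_X, SIGMA_Y), psi)
                       - mean(matmul(SIGMA_Y, SIGMA_X), psi))
    return spread(SIGMA_X, psi) * spread(SIGMA_Y, psi), 0.5 * abs(commutator_mean)

for theta, phi in [(0.0, 0.0), (0.7, 0.3), (1.5, 2.0), (2.8, 4.0)]:
    print(theta, phi, both_sides(theta, phi))
```

For the first state the two sides are equal; for the others the product of the spreads is the larger.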

29. (IX. p. 97.) Dirac's derivation of the Poisson brackets 
in quantum mechanics 

It is fashionable to-day to represent quantum mechanics in an 
axiomatic way without explaining why just these axioms have 
been chosen, justifying them only by the success. I think that 
no real understanding of the theory can be obtained in this way. 
One must follow to some degree the historical development and 
learn how things have actually happened. Now the decisive 
fact was the conviction held by theoretical physicists that many 
features of Hamiltonian mechanics must be right, in spite of the 
fundamentally different aspect of quantum theory. This con- 
viction was based on the surprising successes of Bohr's principle 
of correspondence. In fact, the solution of the problem consisted 
in preserving the formalism of Hamiltonian mechanics as a whole 
with the only modification that the physical quantities are to be 
represented by non-commuting quantities. 

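If this is accepted, there is a most elegant consideration of
Dirac which leads in the shortest way to the rule for translating
formulae of classical mechanics into quantum mechanics. It
starts from the fact that classical mechanics can be condensed
into the equation

$$\frac{df}{dt} = \frac{\partial f}{\partial t} - [H, f] = 0, \qquad (1)$$

which any function f(t,q,p) representing a quantity carried by
the motion must satisfy. Here the Poisson bracket is used

$$[\xi, \eta] = \sum_r\left(\frac{\partial\xi}{\partial q_r}\frac{\partial\eta}{\partial p_r} - \frac{\partial\xi}{\partial p_r}\frac{\partial\eta}{\partial q_r}\right). \qquad (2)$$

If (1) is to be generalized for non-commuting quantities, it is
necessary to consider how the Poisson bracket should be
translated into the new language.

Dirac uses the fact that these brackets have a series of formal
properties, namely

$$[\xi, \eta] = -[\eta, \xi], \qquad [\xi, c] = 0, \qquad (3)$$

where c is a constant; further

$$[\xi_1 + \xi_2, \eta] = [\xi_1, \eta] + [\xi_2, \eta], \qquad [\xi, \eta_1 + \eta_2] = [\xi, \eta_1] + [\xi, \eta_2], \qquad (4)$$

and finally

$$[\xi_1\xi_2, \eta] = [\xi_1, \eta]\xi_2 + \xi_1[\xi_2, \eta], \qquad [\xi, \eta_1\eta_2] = [\xi, \eta_1]\eta_2 + \eta_1[\xi, \eta_2]. \qquad (5)$$

Here the factors are written in a definite order, though in
classical mechanics this does not matter. We have to do so if we
want to use these expressions for non-commuting quantities,
and the rule followed is simply to leave the order of factors
unchanged.

The question is, What do the brackets mean in this case? To see
this, we form the bracket $[\xi_1\xi_2, \eta_1\eta_2]$ in two ways, using the two
formulae (5) first in one order, then in the opposite one. Then

$$[\xi_1\xi_2, \eta_1\eta_2] = [\xi_1, \eta_1]\eta_2\xi_2 + \eta_1[\xi_1, \eta_2]\xi_2 + \xi_1[\xi_2, \eta_1]\eta_2 + \xi_1\eta_1[\xi_2, \eta_2],$$

and in the same way

$$[\xi_1\xi_2, \eta_1\eta_2] = [\xi_1, \eta_1]\xi_2\eta_2 + \xi_1[\xi_2, \eta_1]\eta_2 + \eta_1[\xi_1, \eta_2]\xi_2 + \eta_1\xi_1[\xi_2, \eta_2].$$

Equating these two expressions one obtains

$$[\xi_1, \eta_1](\xi_2\eta_2 - \eta_2\xi_2) = (\xi_1\eta_1 - \eta_1\xi_1)[\xi_2, \eta_2]. \qquad (6)$$

As $[\xi_1, \eta_1]$ must be independent of $\xi_2$, $\eta_2$ and vice versa, it follows

$$\xi_1\eta_1 - \eta_1\xi_1 = \lambda[\xi_1, \eta_1], \qquad \xi_2\eta_2 - \eta_2\xi_2 = \lambda[\xi_2, \eta_2], \qquad (7)$$

where $\lambda$ is independent of all four quantities and commutes with
$\xi_1\eta_1 - \eta_1\xi_1$ and $\xi_2\eta_2 - \eta_2\xi_2$. Hence $\lambda$ is a number. That it must
be purely imaginary, $\lambda = i\hbar$, cannot be derived in such a formal
way; but it follows from considerations like those used in the
previous appendix, where it is shown that a reasonable definition
of averages implies (28.3) that

$$[\xi, \eta] = \frac{1}{i\hbar}(\xi\eta - \eta\xi) \qquad (8)$$

is real. Thus it is established that the Poisson brackets in
quantum theory correspond to properly normalized commutators.

If one inserts in the classical expression (2) for $\xi$ and $\eta$ a
coordinate or a momentum, one finds

$$[q_r, q_s] = 0, \qquad [p_r, p_s] = 0, \qquad [q_r, p_s] = \delta_{rs}, \qquad (9)$$

where $\delta_{rr} = 1$, $\delta_{rs} = 0$ for $r \ne s$.

The same relations (9) must be postulated to hold in quantum
mechanics. In this way the fundamental commutation laws are obtained.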
30. (IX. p. 100.) Perturbation theory for the density matrix 

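We consider the problem of solving the equation

$$\frac{\partial\rho}{\partial t} = [H, \rho], \qquad H = H_0 + V, \qquad (1)$$

where the perturbation function V is small.

The method is essentially the same as that used for the
corresponding problem in matrix or wave mechanics.

Assume that $\lambda$ represents the eigenvalues of a complete set
of integrals $\Lambda$ of $H_0$, so that $[H_0, \Lambda] = 0$ and $H_0$ becomes
diagonal in the $\lambda$-representation; put

$$H_0(\lambda, \lambda) = E, \qquad H_0(\lambda', \lambda') = E', \qquad\text{while}\qquad H_0(\lambda, \lambda') = 0 \ \text{for}\ \lambda \ne \lambda'. \qquad (2)$$

Introduce instead of $\rho$ and V the functions $\sigma$ and U given by

$$\rho(\lambda, \lambda') = \sigma(\lambda, \lambda')\,e^{(i/\hbar)(E'-E)t}, \qquad V(\lambda, \lambda') = \hbar\,U(\lambda, \lambda')\,e^{(i/\hbar)(E'-E)t}. \qquad (3)$$

Then one has

$$\frac{\partial\rho}{\partial t}(\lambda, \lambda') = \left\{\frac{\partial\sigma}{\partial t} + \frac{i}{\hbar}(E'-E)\,\sigma\right\}e^{(i/\hbar)(E'-E)t},$$

$$(H_0\rho - \rho H_0)(\lambda, \lambda') = (E - E')\,\sigma(\lambda, \lambda')\,e^{(i/\hbar)(E'-E)t}.$$

Hence the equation (1) reduces to

$$i\,\frac{\partial\sigma(\lambda, \lambda')}{\partial t} = \sum_{\lambda''}\{U(\lambda, \lambda'')\,\sigma(\lambda'', \lambda') - \sigma(\lambda, \lambda'')\,U(\lambda'', \lambda')\}. \qquad (4)$$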

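Now assume that $\sigma$ is expanded in a series

$$\sigma = \sigma_0 + \sigma_1 + \sigma_2 + \dots, \qquad (5)$$

where $\sigma_0$ is diagonal and independent of the time and $\sigma_1, \sigma_2, \dots$
of order 1, 2,... in the perturbation. Then one obtains

$$i\,\frac{\partial\sigma_1(\lambda, \lambda')}{\partial t} = U(\lambda, \lambda')\{\sigma_0(\lambda') - \sigma_0(\lambda)\};$$

hence

$$\sigma_1(\lambda, \lambda') = \frac{1}{i}\,u(\lambda, \lambda')\{\sigma_0(\lambda') - \sigma_0(\lambda)\}, \qquad (6)$$

$$u(\lambda, \lambda') = \int_0^t U(\lambda, \lambda')\,dt = \frac{1}{\hbar}\int_0^t V(\lambda, \lambda')\,e^{(i/\hbar)(E-E')t}\,dt. \qquad (7)$$

It follows for the diagonal elements from (6) that

$$\sigma_1(\lambda, \lambda) = 0. \qquad (8)$$

The next approximation $\sigma_2$ has to satisfy the equation

$$i\,\frac{\partial\sigma_2(\lambda, \lambda')}{\partial t} = \sum_{\lambda''}\{U(\lambda, \lambda'')\,\sigma_1(\lambda'', \lambda') - \sigma_1(\lambda, \lambda'')\,U(\lambda'', \lambda')\}.$$

We need only the diagonal elements; for these one has

$$\frac{\partial\sigma_2(\lambda, \lambda)}{\partial t} = \sum_{\lambda'}\{U(\lambda, \lambda')\,u(\lambda', \lambda) + u(\lambda, \lambda')\,U(\lambda', \lambda)\}\{\sigma_0(\lambda') - \sigma_0(\lambda)\},$$

which gives by integration

$$\sigma_2(\lambda, \lambda) = \sum_{\lambda'}|u(\lambda, \lambda')|^2\{\sigma_0(\lambda') - \sigma_0(\lambda)\}, \qquad (9)$$

since, according to (7), $u(\lambda, \lambda')$ is hermitian, $u(\lambda, \lambda') = u^*(\lambda', \lambda)$,
and vanishes for t = 0.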

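It is seen from (3) that the diagonal elements of $\rho$ and $\sigma$ are
identical; they represent the probability $P(t, \lambda)$ of finding the
system at time t in the state $\lambda$. Now (5), (8), and (9) give, in
agreement with (9.25) of the text,

$$P(t, \lambda) = P(\lambda) + \sum_{\lambda'}J(\lambda, \lambda')\{P(\lambda') - P(\lambda)\}, \qquad J(\lambda, \lambda') = \frac{1}{\hbar^2}\left|\int_0^t V(\lambda, \lambda')\,e^{(i/\hbar)(E-E')t}\,dt\right|^2. \qquad (10)$$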


When F(A, \') is independent of the time one can perform the 
integration, with the result 

J(M') = KlW 
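For a system with only two states the second-order result can be compared with the exact solution. The sketch below assumes a constant real coupling `v` between two levels `e1` and `e2`, a system that starts with certainty in the first state, and units with $\hbar = 1$. It uses the well-known Rabi formula as the exact reference and, as the approximation, $J = |u|^2$ with the time integral carried out explicitly. All names and numerical values are illustrative.

```python
import math

HBAR = 1.0  # units with hbar = 1 (an illustrative choice)


def exact_transition_probability(e1, e2, v, t):
    """Rabi formula: exact probability of finding the system in state 2 at
    time t when it starts in state 1 and the states are coupled by v."""
    omega = math.sqrt((e1 - e2) ** 2 + 4.0 * v * v)
    return 4.0 * v * v / omega ** 2 * math.sin(omega * t / (2.0 * HBAR)) ** 2


def second_order_probability(e1, e2, v, t):
    """P(t, 2) = P(2) + J {P(1) - P(2)} with
    J = |u|^2 = 4 v^2 sin^2((E - E') t / 2 hbar) / (E - E')^2."""
    delta = e1 - e2
    j = 4.0 * v * v * math.sin(delta * t / (2.0 * HBAR)) ** 2 / delta ** 2
    p_initial_1, p_initial_2 = 1.0, 0.0
    return p_initial_2 + j * (p_initial_1 - p_initial_2)


exact = exact_transition_probability(1.0, 0.0, 0.01, 2.0)
approx = second_order_probability(1.0, 0.0, 0.01, 2.0)
relative_error = abs(exact - approx) / exact
```

For a weak coupling such as `v = 0.01` the two probabilities differ only by a small fraction, as expected of a result that is correct to second order in the perturbation.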


Now the function 





behaves for large t 

like a Dirac 8-function, i.e. one has 


if the interval of integration $\Delta y$ includes $y = 0$ and if $\Delta y \gg 1$.
Suppose that the energy values are distributed so closely that
they form a practically continuous spectrum. Then one
can split the index $\lambda$ into $(\lambda, E)$ and replace the simple summation
in (10) by a summation over $\lambda'$ and an integration over $E'$; the
latter can be performed on the coefficients $J(\lambda, E; \lambda', E')$ with
the result that the formula (10) is unchanged, if the coefficients
are given by

which combines the equations (9.27) and (9.28) of the text. 
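The δ-like behaviour for large $t$ can be illustrated numerically. The exact form of the function used in the text is not legible here, so the sketch below assumes the kernel $(1 - \cos yt)/(\pi t y^2)$, which is normalized to unit integral over all $y$. The function names, the value of `t_large`, and the integration intervals are illustrative assumptions.

```python
import math


def kernel(y, t):
    """Assumed kernel (1 - cos(y t)) / (pi t y^2); its integral over all y is 1."""
    if abs(y) < 1e-12:
        return t / (2.0 * math.pi)  # limiting value at y = 0
    return (1.0 - math.cos(y * t)) / (math.pi * t * y * y)


def integrate(func, a, b, steps=200_000):
    """Midpoint rule on the interval [a, b]."""
    h = (b - a) / steps
    return h * sum(func(a + (k + 0.5) * h) for k in range(steps))


t_large = 200.0

# Interval that contains y = 0: nearly the whole weight is collected.
weight_including_zero = integrate(lambda y: kernel(y, t_large), -1.0, 1.0)

# Interval that excludes y = 0: only a small remainder is found.
weight_excluding_zero = integrate(lambda y: kernel(y, t_large), 0.5, 1.5)
```

The first integral is close to 1 and the second close to 0; both deviations shrink roughly like $1/t$ as `t_large` is increased.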

As mentioned in the text, Green has found a formula which 
allows one to calculate the higher approximations in a very 
simple way. This formula is so elegant and useful that I shall 
give it here, though without proof (which can be found in the 
Appendix I, p. 178, of the paper by M. Born and H. S. Green,
Proc. Roy. Soc. A, 192, 166, 1948). Starting from the equation
(7) or

$$\frac{\partial u}{\partial t} = \frac{1}{\hbar}\,U,$$


where U is known for a given perturbation V by (3), one forms the 
successive commutators 

u 22 = uu uu, i} 28 = u^n uu 22i ,..., (15) 

and from them the expansion 

If the initial condition $u_2 = 0$ for $t = 0$ is added, $u_2$ can be
determined by integrating this series term by term.
Then one forms

and the expansion

from which one can determine $u_3$ so that $u_3 = 0$ for $t = 0$.

The second suffix $l$ in $u_{kl}$ has been chosen to indicate the power
of $V$ which is involved in the expression; one has $u_k = O(V^{2^{k-1}})$
and this decreases rapidly with $k$ when $V$ or $t$ are small. This rule
makes it possible to construct $u_4$, $u_5$,..., in a similar manner.
Then one has the solution of (4) 

a = e u e u *e u *...p Q ...e- u *e- u *e- u (19) 

from which $\rho$ is obtained by (3).

The explicit expressions for the expansion (5) of $\sigma$ are


These formulae will be useful for many purposes in quantum 
theory. Concerning thermodynamics, the third-order terms will 
have a direct application to the theory of fluctuations and 
Brownian motion. The customary theory derives these pheno- 
mena from considerations about the probability of distributions 
in an assembly which differs from the most probable one. The 
theory described here deals with one single system with the 
methods of quantum mechanics (which allows anyhow only 


statistical predictions); deviations from the average will then 
depend on higher approximations. It can be hoped that this idea 
leads to a new approach to the theory of fluctuations in quantum 

31. (IX. p. 112.) The functional equation of quantum statistics

The equation (9.37), 

where $A^{(1)}$ depends on $E^{(1)}$, but not on $E^{(2)}$, and $A^{(2)}$ vice versa, is
obviously of the form

$$f(x+y) = \phi(x)\,\psi(y)$$

treated in Appendix, 13, and has as solution general exponential
functions; hence the distribution for all three systems is

32. (IX. p. 113.). Degeneration of gases 

The theory of gas degeneration is treated in my book Atomic 
Physics, but in a way which appears not to conform with the 
general principles of quantum statistics as explained in these 
lectures. According to these one always has in statistical equilibrium
the canonical distribution, $P = e^{\alpha - \beta E}$, while the presentation
in A. Ph. gives the impression that, by means of a modified
method of statistical enumeration, a different result is obtained. 
This impression is only due to the terms used, which were those 
of the earlier authors (Bose, Einstein, Fermi, Sommerfeld), while 
in fact there is perfect agreement between the general theory 
and the application to gases. A simple and clear exposition of 
this subject is found in the little book by E. Schrödinger,
Statistical Thermodynamics (Cambridge University Press, 1946). 
I shall give here a short outline of the theory. 

In classical theory an ideal gas is regarded as a system of inde- 
pendent particles. In quantum theory this is not permitted, 
because the particles are indistinguishable. If $\psi_1(x^{(1)})$ and
$\psi_2(x^{(2)})$, or shortly $\psi_1(1)$ and $\psi_2(2)$, are the wave functions of two
identical particles with the energies $E_1$ and $E_2$, the Schrödinger
equation for the system of both particles, without interaction,
has obviously the solution $\psi_1(1)\psi_2(2)$ with the energy $E_1 + E_2$;
but as the particles are identical there is another solution
belonging to the same energy, namely $\psi_1(2)\psi_2(1)$. Hence any
linear combination of these is also a solution. Two of these,
namely the symmetric one and the anti-symmetric one,

$$\psi_s(1, 2) = \psi_1(1)\psi_2(2) + \psi_1(2)\psi_2(1),$$
$$\psi_a(1, 2) = \psi_1(1)\psi_2(2) - \psi_1(2)\psi_2(1), \qquad (1)$$

have a special property: the squares of their moduli, $|\psi_s|^2$ and
$|\psi_a|^2$, are unchanged if the particles are interchanged. One can
further show that they do not 'combine', i.e. the mixed interaction
integrals (matrix elements) vanish,

$$\int \psi_s^{*}\, A\, \psi_a \, dx^{(1)} dx^{(2)} = 0, \qquad (2)$$

for any operator A symmetric in the particles. Hence they repre- 
sent two entirely independent states of the system; each state 
being characterized by two energy-levels of the single particle 
occupied, without saying by which particle. 

The same holds for any number of particles. If $E_1, E_2, \ldots, E_n$
are the energies of the states of the isolated particles, the total
system (without interaction) has not only the eigenfunction
$\psi_1(1)\psi_2(2)\ldots\psi_n(n)$ belonging to the energy $E_1 + E_2 + \ldots + E_n$ but
all functions $P\psi_1(1)\psi_2(2)\ldots\psi_n(n)$, where $P$ means any permutation
of the particles, hence also all linear combinations of these.
There are in particular two combinations, the symmetric one
and the antisymmetric one,

$$\psi_s(1, 2, \ldots, n) = \sum_P P\,\psi_1(1)\psi_2(2)\ldots\psi_n(n),$$
$$\psi_a(1, 2, \ldots, n) = \sum_P \pm P\,\psi_1(1)\psi_2(2)\ldots\psi_n(n) \qquad (3)$$

($+$ for even, $-$ for odd permutations),

which have the same simple properties as described in the case
of two particles: $\psi_s$ remains unaltered when two particles are
exchanged, while $\psi_a$ (which can be written as a determinant)
changes its sign; hence $|\psi_s|^2$ and $|\psi_a|^2$ remain unchanged.
Further, the two states do not combine, a fact expressed by
formula (2).

The functions $\psi_s$ and $\psi_a$ describe the state of the $n$-particle
system in such a way that the particles have lost their indi- 
viduality; the only thing which counts is the number of particles 
having a definite energy-level. 

Experiment has shown that this description is adequate for all 


particles in nature; every type of particle belongs either to one 
or the other of these two classes. 

The eigenfunctions of electrons belong, in view of spectroscopic
and other evidence, to the antisymmetric type; hence they vanish
if two of the single eigenfunctions $\psi_\alpha(\beta)$ are identical, i.e. if two
particles are in the same quantum state. This is the mathematical
formulation of Pauli's exclusion principle. Nucleons
(neutrons or protons) and neutrinos are of the same type; one 
speaks of a Fermi-Dirac (F.D.) gas. Photons and mesons, 
however, and many nuclei (containing an even number of 
nucleons) are of the other type, having symmetric eigen- 
functions; they form a Bose-Einstein (B.E.) gas. 
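The behaviour of the two combinations under exchange of particles, and the vanishing of the antisymmetric one when two single-particle states coincide, can be checked directly by summing over permutations. The sketch below is illustrative only: the three "orbitals" $1$, $x$, $x^2$ and the particle positions are arbitrary assumptions, and the function names are hypothetical.

```python
from itertools import permutations


def permutation_sign(perm):
    """+1 for an even permutation, -1 for an odd one (parity of inversions)."""
    inversions = sum(1 for i in range(len(perm))
                     for j in range(i + 1, len(perm)) if perm[i] > perm[j])
    return -1 if inversions % 2 else 1


def combined_wave_function(orbitals, positions, antisymmetric):
    """Sum over all permutations P of psi_1(P1) psi_2(P2) ... psi_n(Pn),
    with the sign of P included in the antisymmetric case."""
    total = 0.0
    for perm in permutations(range(len(orbitals))):
        term = 1.0
        for k, particle in enumerate(perm):
            term *= orbitals[k](positions[particle])
        total += (permutation_sign(perm) if antisymmetric else 1) * term
    return total


# Illustrative single-particle functions and particle positions.
orbitals = [lambda x: 1.0, lambda x: x, lambda x: x * x]
positions = [0.2, 0.5, 0.9]
swapped = [0.5, 0.2, 0.9]  # particles 1 and 2 exchanged

psi_s = combined_wave_function(orbitals, positions, antisymmetric=False)
psi_a = combined_wave_function(orbitals, positions, antisymmetric=True)
psi_s_swapped = combined_wave_function(orbitals, swapped, antisymmetric=False)
psi_a_swapped = combined_wave_function(orbitals, swapped, antisymmetric=True)

# Two particles in the same single-particle state: the antisymmetric sum vanishes.
repeated = [orbitals[0], orbitals[1], orbitals[1]]
psi_a_repeated = combined_wave_function(repeated, positions, antisymmetric=True)
```

Here `psi_a` is the determinant of the matrix of values $\psi_k(x_j)$; exchanging two positions reverses its sign and leaves `psi_s` unchanged.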

In both cases the total energy may be written

$$E = n_1\epsilon_1 + n_2\epsilon_2 + \ldots = \sum_s n_s \epsilon_s, \qquad (4)$$

where $\epsilon_1, \epsilon_2, \ldots$ are the possible energy-levels of the single particles
and $n_1, n_2, \ldots$ integers which indicate how often this level appears
in the original sum $E_1 + E_2 + \ldots + E_n$ (where each $E_k$ was attributed
to one definite particle).

The sum of these occupation numbers $n_1, n_2, \ldots$

$$n_1 + n_2 + \ldots = \sum_s n_s = n \qquad (5)$$

may be given or it may not. The latter holds if particles are
absorbed or emitted, as in the case of photons. For a B.E. gas,
including the case of photons, there is no restriction of the $n_s$,
while for a F.D. gas each energy-value $\epsilon_s$ can only appear once,
if it appears at all. Hence one has the two cases

$$\text{(B.E.)} \quad n_s = 0, 1, 2, 3, \ldots$$
$$\text{(F.D.)} \quad n_s = 0, 1. \qquad (6)$$

Now we apply the general laws of statistical equilibrium, which 
have to be supplemented by the fundamental rule of quantum 
mechanics that each non-degenerate (simple) quantum state has 
the same weight. (This is implied by the equation (9.14) of the 
text which shows that the diagonal element of the density matrix 
determines the number of particles in the corresponding state.) 
As we have seen in Appendix, 14, it suffices to calculate the
partition function $Z$ (14.24), p. 158, with all $\omega_s = 1$,

$$Z = \sum e^{-\beta \sum_s n_s \epsilon_s}, \qquad (7)$$


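where the sum is to be extended over all quantum numbers $n_s$,
which describe a definite state of the system. These are just the
numbers introduced in (4), with or without the restriction (5)
according to the type of particle. Introducing the abbreviation

$$z_s = e^{-\beta \epsilon_s} \qquad (8)$$

one has

$$Z = \sum_{n_1, n_2, \ldots} z_1^{n_1} z_2^{n_2} \ldots = \sum_{n_1} z_1^{n_1} \sum_{n_2} z_2^{n_2} \ldots = \prod_s \sum_{n_s} z_s^{n_s}. \qquad (9)$$

The sum is easily evaluated for the two cases (6),

$$\text{(B.E.)} \quad \sum_{n_s} z_s^{n_s} = \frac{1}{1 - z_s},$$
$$\text{(F.D.)} \quad \sum_{n_s} z_s^{n_s} = 1 + z_s.$$

One can conveniently combine the results into one expression

$$Z = \prod_s (1 \mp z_s)^{\mp 1}, \qquad (10)$$

where the upper sign refers throughout to the B.E. case.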
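The factorization of the unrestricted sum into a product over levels can be verified by brute-force enumeration for a few levels. In the sketch below the three energy levels, the value of $\beta$, and the cut-off of the B.E. occupation numbers at 40 (needed to make the enumeration finite) are illustrative assumptions. The closed form used is $1/(1 - z_s)$ per level for B.E. and $1 + z_s$ for F.D.

```python
import math
from itertools import product


def z_by_enumeration(z, max_occupation):
    """Sum of z_1^n_1 z_2^n_2 ... over all occupation numbers 0..max_occupation."""
    total = 0.0
    for occupation in product(range(max_occupation + 1), repeat=len(z)):
        total += math.prod(zs ** n for zs, n in zip(z, occupation))
    return total


def z_closed_form(z, bose):
    """Product over levels of 1/(1 - z_s) (B.E.) or 1 + z_s (F.D.)."""
    sign = -1.0 if bose else 1.0
    return math.prod((1.0 + sign * zs) ** sign for zs in z)


beta = 1.0
levels = [1.0, 1.5, 2.3]                     # illustrative energy levels
z = [math.exp(-beta * e) for e in levels]

z_fd_enumerated = z_by_enumeration(z, max_occupation=1)
z_be_enumerated = z_by_enumeration(z, max_occupation=40)  # truncated geometric series
z_fd_closed = z_closed_form(z, bose=False)
z_be_closed = z_closed_form(z, bose=True)
```

The truncation at 40 quanta per level changes the B.E. sum only by terms of order $z_s^{41}$, which are negligible for the values chosen.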

This formula contains the theory of radiation, where the 
condition (5) does not apply. But it is more convenient to deal 
with the instance where (5) holds and to relax this condition in 
the final result. 

A glance at the original form (9) of $Z$ shows that the condition
(5) indicates the selection from (10) of those terms which are
homogeneous of order $n$ in all the $z_s$.

This can be done by the method of complex integration. We
form the generating function

$$f(\zeta) = \prod_s (1 \mp \zeta z_s)^{\mp 1} \qquad (11)$$

and expand it in powers of $\zeta$. The coefficient of $\zeta^n$ is obviously
equal to the product (9) with the restriction (5). Hence we obtain
instead of (10)

$$Z = \frac{1}{2\pi i} \oint \frac{f(\zeta)}{\zeta^{n+1}}\, d\zeta, \qquad (12)$$

where the path of integration surrounds the origin in such a way
that no other singularity is included except $\zeta = 0$.
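That the coefficient of $\zeta^n$ in the generating function selects exactly the terms with total particle number $n$ can be confirmed for a small example. The sketch below expands the generating function as a power series by elementary multiplication (rather than by contour integration) and compares the result with a direct enumeration. It assumes the generating function is the product over levels of $1/(1 - \zeta z_s)$ for B.E. and $1 + \zeta z_s$ for F.D.; the values of `z` and `n` are illustrative.

```python
import math
from itertools import product


def series_coefficients(z, n, bose):
    """Power-series coefficients of the generating function up to zeta**n."""
    coeffs = [1.0] + [0.0] * n
    for zs in z:
        if bose:
            # Multiply by 1/(1 - zeta*zs): new[k] = old[k] + zs * new[k-1].
            for k in range(1, n + 1):
                coeffs[k] += zs * coeffs[k - 1]
        else:
            # Multiply by (1 + zeta*zs): new[k] = old[k] + zs * old[k-1].
            for k in range(n, 0, -1):
                coeffs[k] += zs * coeffs[k - 1]
    return coeffs


def restricted_sum(z, n, bose):
    """Direct sum of z_1^n_1 z_2^n_2 ... over occupations with n_1 + n_2 + ... = n."""
    max_occupation = n if bose else 1
    total = 0.0
    for occupation in product(range(max_occupation + 1), repeat=len(z)):
        if sum(occupation) == n:
            total += math.prod(zs ** k for zs, k in zip(z, occupation))
    return total


z = [0.5, 0.3, 0.2, 0.1]   # illustrative values of z_s
n = 2

fd_coefficient = series_coefficients(z, n, bose=False)[n]
be_coefficient = series_coefficients(z, n, bose=True)[n]
fd_direct = restricted_sum(z, n, bose=False)
be_direct = restricted_sum(z, n, bose=True)
```

For these values the F.D. coefficient is 0.41 (the sum of all products of two different $z_s$) and the B.E. coefficient is 0.80 (the same sum plus the squares).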

For large n this integral can be evaluated by the method of 
steepest descent. It is easy to see that the integrand has one and 
only one minimum on the real positive axis. 


As in previous cases (see Appendix, 14, 15) the crudest
approximation suffices. One writes the integrand in the form
$e^{g(\zeta)}$, where

$$g(\zeta) = -(n+1)\log\zeta + \log f(\zeta),$$

and determines the minimum of the function $g(\zeta)$ from

$$g'(\zeta) = -\frac{n+1}{\zeta} + \frac{f'(\zeta)}{f(\zeta)} = 0; \qquad (13)$$

then one has to calculate $g(\zeta)$ and $g''(\zeta)$
for the value of $\zeta$ which is the root of (13). To a first
approximation one finds

$$Z = \frac{e^{g(\zeta)}}{\sqrt{2\pi g''(\zeta)}},$$

$$\log Z = -(n+1)\log\zeta + \log f(\zeta) - \tfrac{1}{2}\log\{2\pi g''(\zeta)\}.$$

Neglecting 1 compared with $n$ and the last term (which can
be seen to be of a smaller order), one obtains

$$\log Z = -n\log\zeta + \log f(\zeta); \qquad (14)$$

here $\zeta$ is the root of (13), where also $n+1$ can be replaced by $n$.
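The quality of this crude approximation can be judged in a case where the exact coefficient is known. The sketch below treats a F.D. system of `levels` single-particle states that all carry the same weight `weight`, so that the generating function is $(1 + \zeta z)^M$ and the exact coefficient of $\zeta^n$ is a binomial coefficient times $z^n$. The saddle point is found by bisection from $n = \zeta f'(\zeta)/f(\zeta)$. The system size, the weight, and the function names are illustrative assumptions.

```python
import math


def log_f(zeta, z):
    """log of the F.D. generating function, sum_s log(1 + zeta z_s)."""
    return sum(math.log(1.0 + zeta * zs) for zs in z)


def particle_number(zeta, z):
    """zeta f'(zeta) / f(zeta) for the F.D. generating function."""
    return sum(zeta * zs / (1.0 + zeta * zs) for zs in z)


def saddle_point(z, n):
    """Root of particle_number(zeta) = n, found by bisection on log(zeta)."""
    low, high = -50.0, 50.0
    for _ in range(200):
        mid = 0.5 * (low + high)
        if particle_number(math.exp(mid), z) < n:
            low = mid
        else:
            high = mid
    return math.exp(0.5 * (low + high))


levels, n, weight = 1000, 300, 0.5
z = [weight] * levels

zeta = saddle_point(z, n)
log_z_saddle = -n * math.log(zeta) + log_f(zeta, z)

# Exact value: log of binomial(levels, n) * weight**n.
log_z_exact = (math.lgamma(levels + 1) - math.lgamma(n + 1)
               - math.lgamma(levels - n + 1) + n * math.log(weight))
relative_error = abs(log_z_saddle - log_z_exact) / abs(log_z_exact)
```

For these numbers the estimate of $\log Z$ agrees with the exact value to about one per cent; the difference is essentially the neglected logarithmic term.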
Now one gets from (11)

$$\log f(\zeta) = \mp \sum_s \log(1 \mp \zeta z_s). \qquad (15)$$

Hence, from (8) and (13),

$$n = \sum_s \frac{1}{e^{\alpha + \beta \epsilon_s} \mp 1}, \qquad \zeta = e^{-\alpha}. \qquad (16)$$

From this equation $\alpha$ (or $\zeta$) can be determined as function of
the particle number and of temperature. One easily sees now
that the case where the number of particles $n$ is not given is
obtained by just omitting the equation (16) and putting $\alpha = 0$
or $\zeta = 1$. Yet the equation (16) is not entirely meaningless now;
it gives the changing number of particles actually present.


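The mean number of particles of the kind $s$ is obviously

$$\bar{n}_s = -\frac{1}{\beta Z}\frac{\partial Z}{\partial \epsilon_s} = -\frac{1}{\beta}\frac{\partial \log Z}{\partial \epsilon_s};$$

hence, from (14) and (15),

$$\bar{n}_s = \frac{1}{e^{\alpha + \beta \epsilon_s} \mp 1}, \qquad (17)$$

which confirms (16) with (5). This formula, for the B.E. case
(minus sign), has been mentioned in VIII (8.20), where it was
obtained by a completely different consideration of Einstein's.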
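The determination of the parameter $\alpha$ from a given particle number can be sketched numerically. The example below assumes mean occupation numbers of the form $1/(e^{\alpha + \beta\epsilon_s} \mp 1)$ (minus sign for B.E., plus sign for F.D.) and finds $\alpha$ by bisection so that the occupations add up to $n$. The ten equidistant levels, $\beta = 1$, $n = 3$, and the function names are illustrative assumptions.

```python
import math


def mean_occupation(energy, alpha, beta, bose):
    """1 / (exp(alpha + beta*energy) - 1) for B.E., ... + 1 for F.D."""
    sign = -1.0 if bose else 1.0
    return 1.0 / (math.exp(alpha + beta * energy) + sign)


def solve_alpha(levels, n, beta, bose):
    """Bisection for alpha such that the mean occupations sum to n.

    For B.E. alpha must stay above -beta * min(levels), where the lowest
    occupation number diverges.
    """
    low = -beta * min(levels) + 1e-12 if bose else -60.0
    high = 60.0
    for _ in range(200):
        mid = 0.5 * (low + high)
        total = sum(mean_occupation(e, mid, beta, bose) for e in levels)
        if total > n:
            low = mid    # too many particles: alpha must increase
        else:
            high = mid
    return 0.5 * (low + high)


beta = 1.0
levels = [0.5 * k for k in range(1, 11)]   # illustrative energy levels
n = 3.0

alpha_fd = solve_alpha(levels, n, beta, bose=False)
alpha_be = solve_alpha(levels, n, beta, bose=True)
fd_occupations = [mean_occupation(e, alpha_fd, beta, bose=False) for e in levels]
be_occupations = [mean_occupation(e, alpha_be, beta, bose=True) for e in levels]

# For large alpha both statistics approach the same classical (Boltzmann) value.
classical_ratio = (mean_occupation(0.5, 20.0, beta, bose=True)
                   / mean_occupation(0.5, 20.0, beta, bose=False))
```

In the F.D. case no level holds more than one particle on the average, while in the B.E. case the lowest levels may be occupied several times over.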

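In the same way, the average energy of the system is found to be

$$E = \sum_s \epsilon_s \bar{n}_s = \sum_s \frac{\epsilon_s}{e^{\alpha + \beta \epsilon_s} \mp 1}, \qquad (18)$$

in agreement with (4).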

These are the fundamental formulae of quantum gases, derived 
from the general kinetic theory. They are to be found in A. Ph. 
Ch. VII, p. 197; in particular the fundamental formula (17) in 
5, p. 224, for B.E., and in 6, p. 228, for F.D. All further 
developments may be read there (or in any other of the 
many books dealing with the subject). I wish to conclude this 
presentation by giving the explicit formulae for monatomic 
gases, where the energy is $\epsilon = p^2/2m$ and the summation over
cells is to be replaced by an integration over the momentum 
space. The weight of a cell is found, by a simple quantum- 
mechanical consideration, for a single particle without spin 
(A. Ph. Ch. VII, 4, p. 215) to be 


Hence, introducing the integration variable 
one obtains from (17) and (18) 



which are the quantum generalizations of the formulae given in
Appendix 14 and reduce to them for $\alpha \to \infty$. A detailed discussion
would be outside the plan of this book. It need only be
said that the F.D. statistics of electrons have been fully confirmed
by the study of the properties of metals (A. Ph. Ch. VII. 7,
p. 229; 8, p. 232; 9, p. 235; 10, p. 236; A. XXX, p. 352).

33. (IX. p. 116.) Quantum equations of motion 

At the end of Chapter VI, which deals with the kinetic theory 
of (dense) matter from the classical standpoint, the statistical 
derivation of the phenomenological hydro-thermal equations 
was mentioned and reference made to this later Appendix, which 
belongs to quantum theory. This was done to save space; for 
the classical derivation is essentially the same as that based on 
quantum theory, and one easily obtains it from the latter by a 
few simple rules. 

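The first of these rules is, of course, the correspondence of the
normalized commutator $[\alpha, \beta] = \frac{1}{i\hbar}(\alpha\beta - \beta\alpha)$ with the Poisson
bracket

$$[\alpha, \beta] = \sum_i \Big(\frac{\partial \alpha}{\partial x^{(i)}} \cdot \frac{\partial \beta}{\partial p^{(i)}} - \frac{\partial \beta}{\partial x^{(i)}} \cdot \frac{\partial \alpha}{\partial p^{(i)}}\Big), \qquad (1)$$

if $x^{(i)}$ and $p^{(i)}$ are the position and momentum vectors on which
$\alpha$ and $\beta$ depend.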

The second rule concerns the operator x> which in the text is 
described in words; expressed in mathematical symbols it is 

Xr .. = JJ dxWx<d'8(xM_x%.. (2) 

It has to be interpreted classically to mean 

Xa ... = JJdx(^<>.... (3) 

Thus the classical operation f d| (fl) corresponds to 

i.e. to substituting x (fl) for x (c)/ as stated in the text. 

In using the correspondence principle to proceed from classical
to quantum mechanics a product $\alpha\beta$ may not be left unchanged
unless $\alpha$ and $\beta$ commute; in general one must replace $\alpha\beta$ by
the symmetrized product $\frac{1}{2}(\alpha\beta + \beta\alpha)$.

By applying these rules one can easily go over from quantum 


to classical formulae (and in many cases also vice versa). There- 
fore we give here only the quantum treatment. 
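The reason for symmetrizing a product of non-commuting quantities can be seen in a small matrix example: the plain product of two hermitian matrices is in general not hermitian, whereas half the sum of the two orderings is. The 2×2 matrices below are arbitrary illustrative choices and the helper names are hypothetical.

```python
def matmul(a, b):
    """Product of two matrices given as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]


def is_hermitian(a, tol=1e-12):
    """True if the square matrix equals its conjugate transpose within tol."""
    return all(abs(a[i][j] - a[j][i].conjugate()) < tol
               for i in range(len(a)) for j in range(len(a)))


def symmetrized_product(a, b):
    """Half the sum of the two orderings, (ab + ba) / 2."""
    ab, ba = matmul(a, b), matmul(b, a)
    return [[0.5 * (ab[i][j] + ba[i][j]) for j in range(len(ab[0]))]
            for i in range(len(ab))]


# Two hermitian matrices that do not commute (illustrative values).
alpha = [[1, 1j], [-1j, 2]]
beta = [[0, 1], [1, 0]]

plain = matmul(alpha, beta)
symmetric = symmetrized_product(alpha, beta)
```

Here `plain` fails the hermiticity test while `symmetric` passes it, so only the symmetrized product can represent a real observable.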

To derive the equation (9.48) from (9.46) with the help of 
(9.47), one proceeds by steps of which only the first need be 
given, as the following ones are precisely similar. One has

$$H_N = H_{N-1} + \frac{1}{2m}\,p^{(N)2} + \sum_{i=1}^{N-1} \Phi^{(iN)}. \qquad (4)$$

Hence, applying the operation $\chi_N$ to (9.46), one obtains, using

P W2 > ftr] + j 

The middle term on the right-hand side is 

. PN]- 

and vanishes on transformation to a surface integral, because
there is no flow across the boundary at a large distance. Hence
(5) reduces to the equation (9.48) with $q = N-1$, which completes
the first step. The following steps are of the same pattern.
In order to make the transition from the 'microscopic' equa- 
tions of motion (9.48) of the molecular clusters to the macro- 
scopic equations of hydrodynamics, one needs first to define the 
density and macroscopic velocity in terms of the molecular 
quantities. The generalized 'density' $n_q$, which reduces to the
ordinary number density $n_1$ for $q = 1$, is obtained as a function
of the positions $x^{(1)}, \ldots, x^{(q)}$ by writing $x^{(i)'} = x^{(i)}$ $(i = 1, 2, \ldots, q)$
in the density matrix $\rho_q(x, x')$. The macroscopic velocity $u_q^{(i)}$
for a molecule $(i)$ in the cluster of $q$ molecules whose positions are
given is the average value of the quantity represented by the
operator $p^{(i)}/m$:

$$n_q u_q^{(i)} = \frac{1}{m}\{p^{(i)}, \rho_q\}_{x'=x}, \qquad (6)$$

where the bracket {...} indicates the symmetrized product, as
introduced above, and the subscript $x' = x$ the diagonal
elements of the matrix.


By expressing (9.48) in the coordinate representation and 
writing x (t>) ' = x (i) , one obtains the equation of continuity 


= ~ 

,x')] x , =x = 0. 

Next, multiply (9.48) by the operator $p^{(i)}$ before and after, taking
half the sum, and then write $x^{(j)'} = x^{(j)}$ $(j = 1, 2, \ldots, q)$.
The left-hand side evidently reduces to $\partial(m\,n_q u_q^{(i)})/\partial t$. One has




and {p, [*<, P9 ]}^ K = 


{P (i) ,x, + i[^ +1) .p 9+1 ]}x- x = - J 

Hence, if a tensor $l_q^{(ij)}$ is defined by

one has 

= 0. (8) 

By using (7) one obtains 


c\ n Q 

or, if d/dt is the convective derivative ~- + V u^. ^ , 

ct j^i dx (1) 

Hence (8) may be written in the form 

where p tfi) 

fl )} x ^ x -mn a <-)<> (11) 


here v<*> = I p(*>~i4*> f[ S(x^-x< / ) (12) 

is the relative molecular velocity referred to the visible motion. 

The equation (10) is the generalized equation of motion of the
cluster of $q$ molecules, which reduces to the ordinary equation of
hydrodynamics when $q = 1$.

$p_q^{(ij)}$, the generalized pressure tensor, is seen to consist of two
parts $k_q^{(ij)}$ and $l_q^{(ij)}$ associated with the kinetic energy of motion
and the potential energy between the molecules respectively.

The diagonal element of the tensor $k_q$ is a multiple of the
kinetic temperature $T_q$ defined by

The equation of energy transfer can be obtained in the same way
as the equation of motion by calculating the rate of change with
time of

34. (IX. p. 118.) Supraconductivity 

There exists a satisfactory phenomenological theory of supra- 
conductivity, mainly due to F. London; it is excellently pre- 
sented in a book by M. von Laue, where the literature can be 
found. (Theorie der Supraleitung: Springer, Berlin u. Göttingen,

Many attempts to formulate an electronic theory have 
been made, without much success. Recently W. Heisenberg 


has published some papers (Z. f. Naturforschung, 2 a, p. 185, 
1947) which claim to explain the essential features of the pheno- 
menon. According to this theory every metal ought to be supra- 
conductive for sufficiently low temperatures. Actually the alkali 
metals which have one 'free' electron are not supraconductive
even at the lowest temperatures at present obtainable, and it is 
not very likely that a further decrease of temperature will change 
this. There are also theoretical objections against Heisenberg's 
method.

A different theory has been developed by my collaborator 
Mr. Kai Chia Cheng and myself, which connects supracon- 
ductivity with certain properties of the crystal lattice and pre- 
dicts correlations between structure and supraconductive state, 
which are confirmed by the facts (e.g. the behaviour of the alkali 
metals). The complete theory will be worked out in due course. 

35. (X. p. 124.) Economy of thinking 

The ideal of simplicity has found a materialistic expression in 
Ernst Mach's principle of economy in thought (Prinzip der 
Denk-Okonomie). He maintains that the purpose of theory in 
science is to economize our mental efforts. This formulation, 
often repeated by other authors, seems to me very objectionable. 
If we want to economize thinking the best way would be to stop 
thinking at all. A minimum principle like this has, as is well 
known to mathematicians, a meaning only if a constraining 
condition is added. We must first agree that we are confronted 
with the task not only of bringing some order into a vast expanse 
of accumulated experience but also of perpetually extending this 
experience by research; then we shall readily consent that we
would be lost without the utmost efficiency and clarity in think- 
ing. To replace these words by the expression 'economy of 
thinking' may have an appeal to engineers and others interested 
in practical applications, but hardly to those who enjoy thinking 
for no other purpose than to clarify a problem. 

36. (X. p. 127.) Concluding remarks 

I feel that any critical reference to philosophical literature 
ought to be based on quotations. Yet, as I have said before, my 
reading of philosophical books is sporadic and unsystematic, 
and what I say here is a mere general impression. A book which 


I have recently read with some care is E. Cassirer's Determinismus
und Indeterminismus in der modernen Physik (Göteborg,
Elanders, 1937), which gives an excellent account of the situa- 
tion, not only in physics itself but also with regard to possible 
applications of the new physical ideas to other fields. There one 
finds references to and quotations from all great thinkers who have 
written about the problem. The last section contains Cassirer's
opinion on the ethical consequences of physical indeterminism 
which is essentially the same as that expressed by myself. I 
quote his words (translated from p. 259): 'From the significance 
of freedom, as a mere possibility limited by natural laws, 
there is no way to that "reality" of volition and freedom of 
decision with which ethics is concerned. To mistake the choice 
(Auswahl) which an electron, according to Bohr's theory, has
between different quantum orbits, with a choice (Wahl) in 
the ethical sense of this concept, would mean to become the 
victim of a purely linguistic equivocality. To speak of an ethical 
choice there must not only be different possibilities but a con- 
scious distinction between them and a conscious decision 
about them. To attribute such acts to an electron would be a 
gross relapse into a form of anthropomorphism. . . . ' Concerning 
the inverse problem whether the 'freedom' of the electron helps 
us to understand the freedom of volition he says this (p. 261): 
'It is of no avail whether causality in nature is regarded in 
the form of rigorous "dynamical" laws or of merely statistical 
laws. ... In neither way does there remain open an access to 
that sphere of "freedom" which is claimed by ethics'. 

My short survey of these difficult problems cannot be com- 
pared with Cassirer's deep and thorough study. Yet it is a 
satisfaction to me that he also sees the philosophical importance 
of quantum theory not so much in the question of indeterminism 
but in the possibility of several complementary perspectives or 
aspects in the description of the same phenomena as soon as 
different standpoints of meaning are taken. There is no unique 
image of our whole world of experience. 

This last Appendix, added after delivering the lectures, gives 
me the opportunity to express my thanks to those among my 
audience who came to me to discuss problems and to raise 
objections. One of these was directed against my expression 


'observational invariants'; it was said that the conception of 
invariant presupposes the existence of a group of transformations 
which is lacking in this case. I do not think that this is right. 
The problem is, of course, a psychological one; what I call 
'observational invariants' corresponds roughly to the Gestalten
of the psychologists. The essence of Gestalten theory is that the
primary perceptions consist not in uncoordinated sense im- 
pressions but in total shapes or configurations which preserve 
their identity independently of their own movements and the 
changing standpoint of the observer. Now compare this with 
a mathematical example, say the definition of the group of 
rotations as those linear transformations of the coordinates 
$x, y, z$ for which $x^2 + y^2 + z^2$ is invariant. The latter condition
can be interpreted geometrically as postulating the invariance 
of the shape of spheres. Hence the group is defined by assuming 
the existence of a definite invariant configuration or Gestalt, not 
the other way round. The situation in psychology seems to me 
quite analogous, though much less precise. Yet I think that this 
analogy is of some help in understanding what we mean by real 
things in the flow of perceptions. 

Another objection was raised against my use of the expression 
'metaphysical' because of its association with speculative sys- 
tems of philosophy. I need hardly say that I do not like this kind 
of metaphysics, which pretends that there is a definite goal to be 
reached and often claims to have reached it. I am convinced 
that we are on a never-ending way; on a good and enjoyable way,
but far from any goal. Metaphysical systematization means 
formalization and petrification. Yet there are metaphysical 
problems, which cannot be disposed of by declaring them
meaningless, or by calling them with other names, like epistemo- 
logy. For, as I have repeatedly said, they are 'beyond physics' 
indeed and demand an act of faith. We have to accept this fact 
to be honest. There are two objectionable types of believers: 
those who believe the incredible and those who believe that 
'belief' must be discarded and replaced by 'the scientific method'.
Between these two extremes on the right and the left there is 
enough scope for believing the reasonable and reasoning on 
sound beliefs. Faith, imagination, and intuition are decisive 
factors in the progress of science as in any other human activity. 



Absolute temperature, 38, 42, 48, 53, 

149, 157. 

Absorption, 82, 86. 
Accessibility, 144-5.
Ackermann, 169.
Adiabatics, 34-9, 147, 149-51. 
Adrian, A. !>., 125. 
Advanced potential, 26. 
Angular momentum, 86, 111, 182, 

184.

Antecedence, 9, 12, 15, 17,25-6, 30-2, 

44, 71-3, 102, 120, 124. 
Astronomy, 10-16. 
Atom, 17, 84. 
Atomic physics, 187. 
Atomistics, 46. 
Avogadro's number, 63, 176. 

Balmer, 85, 87. 

Bernoulli, D., 47. 

Binary encounters, 68. 

Biot, 138. 

Boer, de, 169. 

Bohr, 82, 85-7, 105, 127, 191, 208. 

Boltzmann, 53, 55, 56, 58-60, 62, 76, 

140, 163, 167, 182-3. 
Boltzmann's constant, 53, 158, 176. 

equation, 55, 68, 71, 165, 172, 174, 


H-theorem, 57-9, 113.

Born, 65, 169, 186, 188-9, 195, 207. 

Boscovich, 46. 

Bose, 112, 197. 

Bose-Einstein statistics, 113, 199-202. 

Boyle-Charles law, 48. 

Boyle's law, 47, 54, 59, 149, 152, 159. 

Broglie, de, 89-92. 

Brownian motion, 62-5, 73, 99, 170, 


Bucherer, 27. 
Biichner, 74. 

Caloric, 31. 
Calorimeter, 31. 

Canonical distribution, 60, 72, 112, 
113, 175, 187, 197. 

form, 18, 49, 96. 
Carathéodory, 38-9, 143, 146, 149.
Carathéodory's principle, 39, 41, 42.
Carnot cycle, 143. 

Casimir, H. B. G., 151. 

Cassirer, E., 208. 

Cauchy, 19, 21, 26, 44, 114, 134, 140. 

Cauchy's equation, 20, 58, 70, 141. 

theorem, 160. 

Causality, 3, 5-9, 17, 72, 76, 95, 101-3, 
120, 124, 126, 129. 

Cause, 47, 92, 101, 109, 120, 129. 
Cause-effect relation, 5, 15, 46-7, 71. 
Cavendish, 23. 
Chance, 3, 46-73. 83, 84, 92, 101, 103, 

109, 120, 123. 
Chapman, 56. 

and Cowling, 56, 58. 
Charge, 104. 

Chemical equilibrium, 146. 
Cheng, Kai Chia, 207. 
Clausius, 38. 

Collision cross-section, 55, 101, 109, 

integral, 56, 59, 68, 72, 163, 180, 


Colloids, 65. 

Commutation law, 89, 91, 189. 
Commutator, 94-7, 190, 193, 203. 
Conduction of heat, 54, 58, 65, 70, 114. 
Conservation of energy, 19, 119, 164, 


of mass, 20. 

of momentum, 164, 182. 
Constants of motion, 112. 
Contact forces, 21. 

Contiguity, 9, 12, 16, 17-30, 74, 103, 

120, 124, 140, 141. 
Continuity equation, 20, 24, 31, 49, 

57, 134, 137, 139, 205. 
Continuous media, 19-22, 134. 
Convective derivative, 21, 49, 137, 


Copernicus, 12. 

Corpuscular theory of light, 22. 
Correspondence principle, 87, 191, 


Cosmology, 10, 11. 
Coulomb forces, 23-4, 103, 180. 
Coulomb's law, 25, 85, 138. 
Cowling, see Chapman. 
Critical temperature, 117. 
Curvature of space, 28. 

Darwin, C. G., 54, 155, 160. 
Debye, 65. 
Decay, law of, 84. 
Degeneration of gases, 197. 

temperature, 118. 
Demokritos, 46. 
Density, 20, 34, 170, 204, 

function, 113. 

matrix (or operator), 106, 111, 193, 

199, 204. 

Dependence, 6, 8, 76, 102, 124. 
Descartes, 11, 16. 
Determinism, 3, 5-9, 17, 30, 92, 101, 

108, 110, 120, 122-4, 126. 



Dewar vessel, 35. 
Diffraction, 22, 106. 
Diffusion, 54, 58, 175. 
Dirac, 91, 92, 95, 113, 141, 189, 191. 
Dirac's δ-function, 60, 96, 195.
Displacement current, 25. 
Distribution function, 50, 71, 96, 106, 

law of Bose-Einstein, 81. 

of Maxwell-Boltzmann, 614, 

57, 60, 78, 82, 112, 167, 181. 

Eckhart, 151. 

Economy of thinking, 207. 

Eddington, 141. 

Ehrenfest, 59, 86. 

Eigenfunction, 93-4, 98, 102, 105-6, 

Eigenvalue, 91, 93-4, 98, 100, 111-12, 

Einstein, 15, 27-30, 62-5, 75, 78, 

80-3, 88-90, 92, 100, 101, 112, 

116, 122-4, 141-3, 151, 170, 172, 

173, 175, 197, 202. 
Einstein's law, 75-6. 
Electromagnetic field, 22-6, 138. 

wave, 25. 

Electron, 84-92, 95, 103-5, 116-18, 
141, 199, 207, 208. 

spin, 87, 95, 104, 113. 
Emission, 82, 86. 
Energy and mass, 75. 

and relativity, 90. 
, density of, 29. 

, dissipation of, 113. 

in perturbation method, 100, 111. 

levels, 82-5, 87, 91, 198, 199. 

of atom, 159. 

of oscillator, 77-9. 

of system, 36, 47-8, 53, 147-8, 171- 


surface, 62, 170. 
Enskog, 56. 
Enthalpy, 148. 

Entropy and probability, 53, 57, 151, 
162, 172. 

and temperature, 116. 

, change of, 43-4, 71-2, 183, 186. 
- , definition of, 42, 113-14, 165. 
, establishment of, 38. 

from Caratheodory's theory, 149- 


in chemical equilibria, 148. 

of atom, 159. 

Equations of motion, 18, 21, 49, 57, 
95-6, 103, 116, 137, 182, 203-6. 

of state, 36, 48. 
Equipartition law, 77, 188. 
Erlanger Programm, 104. 
Ether, 19, 22, 74. 

Euler's theorem, 147-8. 
Excluded middle, 107. 
Exclusion principle, 87. 

Faraday, 23-4, 140. 

Fermi, 113, 197. 

Fermi-Dirac statistics, 113, 199-203. 

Field equations, 28-9. 

of force, 17. 

vector, 140. 
FitzGerald, 27. 
Fluctuation law, 79, 81. 
Fluctuations, 151, 17ty-6, 196. 
Fock, V. A., 29, 143. 
Fourier, 31. 

Fowler, R. H., 39, 54, 155, 160. 
Franck, 86. 

Free energy, 44, 61, 148, 159, 167, 

will, 3, 126-7. 
Frenkel, J., 65. 
Frequency, 88-90. 
Fresnel, 22. 
Fuchs, K., 169. 

Functional equation of quantum 
statistics, 112, 197. 

Galileo, 10-13, 129.

Gamma-function, 155. 

Gas constant, 48. 

Gauss, 18, 46, 48, 133. 

Gauss's theorem, 134, 165. 

Geodesic, 29. 

Gerlach, 87. 

Gestalten theory, 209.

Gibbs, 58, 60-2, 65, 66, 112, 149, 167. 

Godel, 108. 

Goeppert-Mayer, 169. 

Goudsmit, 87. 

Gravitation, 13, 17, 26-30, 124, 133, 

134, 141, 142. 
Green, H. S., 65, 68, 100, 115, 120,

169, 181, 183, 185-6, 195.
Groot, de, 151. 

Hamilton, 18, 91, 191. 
Hamiltonian as matrix, 89.

as operator, 95. 

definition of, 18. 

in equations of motion, 49-51, 59, 

66, 103, 115. 

in perturbation method, 99, 110, 


in statistical mechanics, 167. 

of a particle, 102, 166, 176, 182. 

of oscillator, 77, 188. 

Heat, 31, 34, 36-7, 119, 147, 157. 
Heisenberg, 86, 88-9, 92, 94, 188-9, 

191, 206-7. 
Helium, 116-18. 



Helmholtz, 61. 
Hermitian operator, 93. 
Hertz, G., and Franck, J., 86.
Hertz, H., 25. 
Hilbert, 56, 108. 
Hoffmann, B., 29, 143. 
Hooke's law, 21.
Huygens, 22.
Hydrogen atom, 86.
Hydrothermal equations, 115, 118, 
119, 203. 

Ideal gas, 47, 50, 54, 112, 149, 159,

171, 197.

Indeterminacy, 101-9, 119, 208. 
Induction, 6, 7, 14, 46, 64, 84. 
Infeld, L., 29, 143. 
Initial state, 48. 
Integral of motion, 98, 100, 110, 112. 

operator x , 66, 177, 181, 203-4. 
Integrating denominator, 40-1, 143- 


Interference, 22, 106. 
Inter -phenomena, 107. 
Irreversibility, 17, 32, 55, 57-9, 72, 

109, 151, 165, 183. 
Isotherms, 36, 47, 149. 

Jeans, 77, 81. 
Jordan, P., 188-9. 
Joule, 33-6. 

Kahn, B., 169. 
Kelvin, 38, 46. 
Kepler, 10, 12, 129, 132. 
Kepler's laws, 12-13, 129-32. 
Kinetic energy, 18, 50, 61, 80, 116, 
117, 206. 

theory, general, 64-70, 202. 

of gases, 46-58, 69, 151-3, 174. 

, quantum, 109-21. 

Kirkwood, 69, 176. 

Klein, F., 104.
Kohlrausch, 25, 140. 

Lagrange, 18. 
Lagrangian factor, 156. 
A-point, 117-18. 
Landau, 120. 
Laplace, 18, 30. 

differential operator, 22. 
Laue, M. v., 206. 
Levi-Civita, 28, 142. 
Light quantum, 82. 

Liouville's theorem, 49, 52, 58, 60, 66, 

68, 159, 181. 

Logic, three-valued, 107-8. 
London, F., 206. 
Lorentz, 27, 80. 

force, 141. 

Lorentz transformation, 27. 
Loschmidt, 58. 

Mach, E., 207. 

MacMillan, 169. 

Magnetic field, 138. 

Margenau, H., 143. 

Mass, gravitational, 15, 142. 

, inertial, 14, 15, 28, 75, 104, 142. 

Matrix, 83, 88-9, 91, 97-8, 100-1, 115. 

mechanics, 90-1, 188-9. 
Matter, 74, 83. 

Maxwell, 24-7, 50, 51, 54, 56, 62, 112, 

124, 138-41. 
Maxwell's equations, 24-6, 75, 124, 


functional equation, 153. 

tensions, 141. 
Mayer, J., 169. 
Mayer, R., 32. 
Mean free path, 55. 

Mechanical equivalent of heat, 32-3. 

Meixner, 151. 

Mendelssohn, 120. 

Meson, 95, 141, 199. 

Method of ignorance, 68. 

Michels, 169.

Michelson and Morley experiment, 27. 

Minkowski, 28, 142. 

Molecular chaos, 47, 48, 50, 152. 

Moll, 88. 

Momentum, 18, 29, 75, 86, 90, 91, 116, 

136, 152, 184. 
Montroll, 169. 
Multinomial theorem, 160.
Multiplet, 88.
Murphy, G. M., 143. 

Neumann, J. v., 109. 

Neutrino, 199. 

Neutron, 199. 

Newton, 10-30, 74, 80, 95, 103, 124, 

126, 129, 132, 133, 141. 
Non-commuting quantities, 92-6. 
Non-Euclidean geometry, 28. 
Non-linear transformation, 27, 142. 
Nucleon, 199. 
Nucleus, 84, 103, 199. 
Number density, 97-8. 

Objectivity, 124-5. 

Observational invariant, 125, 209. 

Oersted, 25. 

Onsager, L., 151. 

Operator, 91, 93-5, 97-8, 100, 188-9. 

Ornstein, 88. 

Oscillator, 77-9, 85-6, 116. 

Partition function, 54, 60, 61, 72, 
158-9, 162, 167, 171, 199. 



Pauli, 87, 141. 

Pauli's exclusion principle, 199.

Periodic system of elements, 87. 

Perrin, 63, 176. 

Perturbation, 99, 110, 193-6. 

Pfaffian equation, 38-41, 143-4, 147. 

Phase rule, 149. 

space, 49, 51. 

velocity, 22, 90. 
Photo-electric effect, 80-1. 
Photon, 80, 82, 90, 92, 101, 105, 107, 

112, 199. 
Planck, 76, 78-80, 82, 88-90, 100, 101, 


Planck's constant, 79, 85, 127, 142. 
Poincaré, 27, 58, 62.
Poisson bracket, 49, 96, 181, 191-3, 


Poisson's equation, 138. 
Potential, 14, 103, 138, 177. 

energy, 18, 61, 67, 95, 168, 180, 

183, 206. 

Pressure, 22, 23, 34, 47, 48, 53, 61, 
116, 118. 

of light, 75. 

tensor, 206. 
Priestley, 23. 
Prigogine, I., 151. 

Probability, 51, 56, 94, 102-3, 106, 
124, 174. 

and determinism, 48. 

and entropy, 151, 172. 

and irreversibility, 57. 

coefficient, 83. 

 function, 49-50.

of distribution, 49-50, 52, 66-7, 98, 

100, 154, 160-3, 170, 178-80, 194. 

of energy, 79. 
, theory of, 46. 

wave, 105-7. 
Proton, 199. 
Ptolemy, 10. 

Quantum, 76, 80-2, 90, 

conditions, 86, 89. 

gas formulae, 202. 

mechanics, 19, 73, 76, 83, 86, 89, 

92-103, 107-9, 111-21, 123, 
188-93, 196. 

number, 87, 200. 

theory, 17, 76, 82, 84, 110, 116, 120, 

123, 124-7, 142, 197, 203-6. 
Quasi-periodicity, 58, 62, 169. 

Radial distribution function, 69. 
Radiation, 76-83, 88-9, 101, 112, 200. 
- density, 76-7, 83. 
Radio-active decay, 83, 101. 
Radio-activity, 84, 101. 
Rayleigh, 77, 81, 91, 171. 

Reaction velocity, 83. 

Reality, 103-4, 123, 125. 

Reichenbach, 107-8. 

Relativity, 14, 15, 17, 26-30, 74-5, 90, 

124, 141-3. 
Rest-mass, 75. 
Retarded potential, 6. 
Reversibility, 58, 71. 
, microscopic, 151. 
Ricci, 28, 142. 
Riemann, 28, 142. 
Ritz, 85. 
Rotator, 86. 
Russell, 108, 129. 
Rutherford, 83, 84. 

Savart, 138. 

Schrödinger, 89, 91-2, 189, 197.

Self-energy, 109. 

Semi-permeable walls, 43, 146.

Smoluchowski, M. v., 176. 

Soddy, 83. 

Sommerfeld, 86, 197. 

Specific heat, 44, 80, 116-19, 150, 159, 


Spectrum, 85, 87. 
Statistical equilibrium, 48, 50, 53, 60, 

66, 72, 111, 113-16, 119, 153, 

175, 197. 

mechanics, 50, 58-65, 71-3, 77-9, 

84, 155, 167. 

 operator (or matrix), 97-8, 100, 102.

term, 67. 
Statistics, 46, 84. 

Steepest descent, 155, 161, 200. 

Stefan, 76. 

Stern, 87. 

Stirling's formula, 53-4, 154.

Stokes, 176. 

Strain, 20, 34, 44, 140. 

tensor, 21, 70. 
Stress, 20, 34, 44, 140. 

tensor, 20-2, 29, &S, 76, 116, 135. 
Supra-conductivity, 117-18, 206-7. 
Surface forces, 134. 

Temperature, absolute, 38, 42, 48, 53, 
149, 157. 

and Brownian motion, 175. 

and heat, 31, 34. 
, critical, 117. 

degeneration, 118. 

, empirical, 36, 147, 153. 

function, 42, 44. 

, kinetic, 116, 118, 206. 

scale, 36. 

, thermodynamic, 116, 118. 
Tension, 20, 23, 134. 
Thermal energy, 117. 

equilibrium, 35-6, 53, 60. 



Thermal expansion, 116. 
Thermodynamics, 17, 31-45, 53, 110,

1*, 146, 148-9, 151. 
, first law of, 33, 37, 118.
, second law of, 36, 38, 53, 143, 157-8, 173.

Thermometer, 3J, 35. 
Thomson, J. J., 85. 
Time, 16, 27, 32, 71.
, flow of, 32. 
Tisza, 120. 

Transition probability, 101, 114. 
Tycho Brahe, 12, 132. 

Uhlenbeck, 8* 169. 

Uncertainty principle, 94, 96, 104-5, 

117, 189-91. 
Ursell, 61, 168, 169. 

Van der Waals, 54, 60, 61. 
Vector field, 134. 

Velocity distribution, 50, 51. 

of light, 25, 140. 

of sound, 149-51. 
Viscosity, 54, 68, 70, 117, 118, 176. 

Wave equation, 22, 91-2, 102, 140. 

function, 89, 102, 197. 

mechanics, 89, 91, 189. 

theory of light, 22. 
Weber, 25, 140. 
Wien, 76, 78-9, 81. 
Wiener, N., 188. 
Wilson, 86. 

X-rays, 84. 

Young, 22. 
Yukawa, 141. 
Yvon, 169. 

Zermelo, 58, 62, 169.